Spark Machine Learning Example

Spark Machine Learning Application: a machine learning application that uses a recommendation technique, specifically collaborative filtering, to predict which movies to recommend to a user based on other users’ ratings of different movies. Our recommendation engine uses the Alternating Least Squares (ALS) machine learning algorithm. Even though the data sets used in the code example in…

Spark Streaming Example

Spark Streaming Application: This example illustrates a web server log analytics use case to show how Spark Streaming can run analytics on data streams that are generated continuously. These log messages are time series data, defined as a sequence of data points consisting of successive measurements captured…

Hadoop File Formats

Hadoop can store and process unstructured data such as video and text, which is a significant advantage. However, there are many, perhaps more, uses of Hadoop with structured or “flexibly structured” data, meaning data whose fields can be added, changed or removed over time and can even vary among concurrently ingested records. Many of the file format…

Java Collections – HashMap and HashSet

HashMap and HashSet are among the most commonly used collections in Java. Here we summarise key features:

Feature                                          HashMap                                          HashSet
Duplicates                                       Yes, duplicate values but no duplicate keys      No
Adding or storing mechanism                      Hashing technique                                HashMap object
Dummy values                                     No                                               Yes
Implements                                       Map                                              Set
Number of objects required during add operation  2                                                1

When to…
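
The duplicate-handling and add-operation differences summarised above can be seen in a few lines of Java. This is a minimal sketch, not code from the original post: the class name HashCollectionsDemo, its method names, and the sample keys are hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; names and sample data are illustrative assumptions.
public class HashCollectionsDemo {

    // HashMap: put() takes two objects (key and value).
    // A repeated key replaces the old value; duplicate values are allowed.
    static int mapSizeAfterDuplicateKey() {
        Map<String, Integer> ratings = new HashMap<>();
        ratings.put("alice", 4);
        ratings.put("alice", 5); // same key: the value 4 is replaced by 5
        ratings.put("bob", 5);   // same value under a different key is fine
        return ratings.size();
    }

    // HashSet: add() takes one object and rejects duplicates.
    // add() returns false when the element is already present.
    static boolean secondAddSucceeds() {
        Set<String> users = new HashSet<>();
        users.add("alice");
        return users.add("alice");
    }

    public static void main(String[] args) {
        System.out.println(mapSizeAfterDuplicateKey()); // prints 2
        System.out.println(secondAddSucceeds());        // prints false
    }
}
```

The map ends up with two entries because the second put on the same key overwrites the first, while the set silently ignores the repeated element and reports that through the boolean returned by add.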

Alfresco Services Public Java APIs

Public Java API services: The Public Java API provides access to Alfresco through a number of exposed services. These services are accessed via a single point of access, the Service Registry. This information provides an overview of the services exposed by the Public Java API. The following table summarizes the main services…