How To Spin Up a Hadoop Cluster with cloud servers

May 12, 2026
Linux

This tutorial covers setting up a Hadoop cluster on cloud servers. The Hadoop software library is an Apache framework that lets you process large data sets in a distributed way across clusters of servers using simple programming models…

User Data Collection: Balancing Business Needs and User Privacy

May 12, 2026
Tutorials

Collecting user data is common practice in modern sites and applications as a way of providing creators with more information to make decisions and create better experiences. Among other benefits, data can be used to help tailor content, drive product direction, and provide…

What is Big Data?

May 12, 2026
Tutorials

Big data is a blanket term for the non-traditional strategies and technologies needed to organize, process, and gather insights from large datasets. Many users and organizations are turning to big data for certain types of workloads and using it to supplement their existing analysis and business tools. Tools in this space offer different options for ingesting data into a system, storing it, analyzing it, and working with it through visualizations.

An Introduction to Hadoop

May 12, 2026
Tutorials

Apache Hadoop is one of the earliest and most influential open-source tools for storing and processing the massive amount of readily-available digital data that has accumulated with the rise of the World Wide Web. It evolved from a project called Nutch, which attempted to find a…

Apache Spark Example: Word Count Program in Java

May 12, 2026
Web Servers

Apache Spark is an open source data processing framework that performs analytic operations on big data in a distributed environment. It began as an academic project, started by Matei Zaharia at UC Berkeley’s AMPLab in 2009. Apache Spark was created on top of a cluster management tool […]

Hadoop, Storm, Samza, Spark, and Flink: Big Data Frameworks Compared

May 12, 2026
Tutorials

Big data is a blanket term for the non-traditional strategies and technologies needed to gather, organize, process, and extract insights from large datasets. While the problem of working with data that exceeds the computing power or storage of a single computer is not new,…

How To Install and Use ClickHouse on Debian 9

May 12, 2026
Linux

ClickHouse is an open source, column-oriented analytics database created by [Yandex](https://yandex.com) for OLAP and big data use cases. In this tutorial, you’ll install the ClickHouse database server and client on your machine. You’ll use the DBMS for typical tasks and optionally enable remote access so that you can connect to the database from another machine.

How To Install and Use ClickHouse on Ubuntu 20.04

May 12, 2026
Linux

ClickHouse is an open source, column-oriented analytics database created by Yandex for OLAP and big data use cases. In this tutorial, you’ll install the ClickHouse database server and client on your machine. You’ll use the DBMS for typical tasks and optionally enable remote access so that you can connect to the database from another machine. Then you’ll test ClickHouse by modeling and querying example website-visit data.

How to Install Hadoop in Stand-Alone Mode on Debian 9

May 12, 2026
Linux

In this tutorial, you’ll install Hadoop in stand-alone mode on a Debian 9 server. You’ll also run an example MapReduce program to search for occurrences of a regular expression in text files.

How to Install Hadoop in Stand-Alone Mode on Ubuntu 16.04

May 12, 2026
Linux

Hadoop is a Java-based programming framework that supports the processing and storage of extremely large datasets on a cluster of inexpensive machines. It was the first major open source project in the big data playing field and is sponsored by the Apache Software Foundation. …
