Emerging Trends in Big Data Technologies
- Storm: Apache Storm is an open-source, distributed real-time computation system. Storm makes it easy to process streams of data reliably, doing for real-time processing what Hadoop did for batch processing.
- Spark: Apache Spark is an in-memory data-processing platform that is compatible with Hadoop data sources but typically runs much faster than Hadoop MapReduce, because it can keep intermediate results in memory instead of writing them to disk between stages. It’s well suited to machine learning jobs and interactive data queries, and many developers find it easier to use because it includes APIs in Scala, Python, and Java.
- Apache Hive: Apache Hive facilitates querying and managing large datasets residing in distributed storage through a SQL-like language, HiveQL. It also allows MapReduce programmers to plug in custom mappers and reducers when the logic is awkward to express in HiveQL.
- Apache Tajo: Apache Tajo is a relational, distributed data warehouse system for Apache Hadoop. Tajo is designed for low-latency, scalable ad-hoc queries, online aggregation, and ETL (extract, transform, load) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources.
- Twitter's Summingbird
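
The contrast drawn above between batch processing (Hadoop MapReduce) and real-time stream processing (Storm) can be sketched in plain Python. This is a toy illustration only: it does not use the API of Storm, Spark, Hive, or any other system named in the list, and the function names `batch_word_count` and `streaming_word_count`, as well as the sample data, are hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator


def batch_word_count(lines: Iterable[str]) -> Counter:
    """Batch style: consume the whole data set, then return one final result."""
    # "Map" each line to its words, then "reduce" by summing counts per word.
    return Counter(word for line in lines for word in line.lower().split())


def streaming_word_count(lines: Iterable[str]) -> Iterator[Counter]:
    """Streaming style: update running counts and emit a snapshot per record."""
    running = Counter()
    for line in lines:
        running.update(line.lower().split())
        # Yield a copy so that each emitted snapshot is independent.
        yield Counter(running)


# Hypothetical sample input standing in for a large data set or a live stream.
sample = ["storm processes streams", "spark processes batches and streams"]

final = batch_word_count(sample)
snapshots = list(streaming_word_count(sample))
```

In the batch version a result is available only after all input has been read, whereas the streaming version produces an up-to-date answer after every record. The last streaming snapshot equals the batch result, which is roughly the property that lets one computation be expressed for both modes.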