Building Google-in-a-box: Using Apache SolrCloud and Bigtop to index your bigdata

by Roman Shaposhnik (Pivotal Inc.)

You’ve got your Hadoop cluster, you’ve got your petabytes of unstructured data, and you run MapReduce jobs and SQL-on-Hadoop queries. Something is still missing, though. After all, nobody expects us to enter SQL queries when we look for information on the web: AltaVista and Google solved that problem for us ages ago. So why do we still require SQL or Java certification from our enterprise bigdata users? In this talk, we will look at how the integration of SolrCloud into Apache Bigtop now makes it possible to build bigdata indexing solutions and ingest pipelines. We will dive into the details of integrating full-text search into the lifecycle of your bigdata management applications and of exposing the power of Google-in-a-box to all enterprise users, not just a chosen few data scientists.

About the author, Roman Shaposhnik:

Roman Shaposhnik is a graduate of St. Petersburg State University who lives in California. He worked at Sun Microsystems for 11 years, until it was sold to Oracle in 2009. Since then he has spent his time on bigdata and cloud computing projects at a number of companies: Huawei, Yahoo!, Cloudera and now Pivotal. By day, he is a jack of all trades in Hadoop and its ecosystem projects, keeping his employers honest when it comes to the open source agenda and the “Apache Way”. By night, he is an open source hacker, ASF IPMC member and VP of Apache Bigtop.