Get your architecture ready for your big data use case. Datoop takes you from concept to deployed solution.
Datoop provides services to help you build a Hadoop cluster, integrate analytics and ETL with it, apply big data to data science problems, and implement security, search and indexing, log management, and NoSQL databases.
Datoop helps businesses get maximum return on their data through an architecture fitted to the domain use case.
1 Week
- Architect a Hadoop Cluster -
Install or upgrade the Big Data Suite on up to 100 nodes across one or two clusters.
Review the existing Hadoop cluster and related applications.
Recommend performance tuning, data compression, and scheduler configuration.
Finalize the environment for a successful Hadoop cluster implementation.
Document the recommended configuration for the Hadoop cluster.
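As one concrete example of the tuning recommendations above, a cluster will often enable compression of intermediate map output. This is a standard Hadoop 2.x `mapred-site.xml` fragment, shown purely as an illustration of the kind of configuration that gets documented, not as a Datoop deliverable:

```xml
<!-- Fragment of mapred-site.xml (goes inside <configuration>…</configuration>).
     Compresses intermediate map output with Snappy to cut shuffle I/O. -->
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
```

Snappy trades a lower compression ratio for very fast compression, which usually suits shuffle-heavy workloads better than a heavier codec.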
2 Weeks
- Customize Data Pipeline -
Identify solution requirements, including data sources, transformations, and egress points.
Architect & develop a pilot implementation for up to 3 data sources, file transformations, & one target system.
Develop a deployment architecture that will result in a production deployment plan.
Review the Hadoop cluster & application configuration.
Document the system recommendations.
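The pilot pipeline above (multiple sources, a transformation step, one target) can be sketched end to end. Everything here is an illustrative stand-in: the in-memory sources, the `transform` function, and the list sink are hypothetical placeholders for real ingest tooling, not a Datoop API.

```python
def transform(record):
    """Normalize a raw record: lowercase the keys, strip whitespace from values."""
    return {key.lower(): value.strip() for key, value in record.items()}

def run_pipeline(sources, sink):
    """Pull records from each source, transform them, and write them to one sink."""
    for source in sources:
        for record in source:
            sink.append(transform(record))
    return sink

# Two toy "data sources" with inconsistent key casing and stray whitespace.
source_a = [{"Name": " alice ", "City": "Pune"}]
source_b = [{"NAME": "bob", "CITY": " Delhi "}]
result = run_pipeline([source_a, source_b], sink=[])
# -> [{'name': 'alice', 'city': 'Pune'}, {'name': 'bob', 'city': 'Delhi'}]
```

In a real deployment the sources would be file drops or database extracts and the sink would be HDFS or a target database, but the shape of the flow is the same.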
2 Weeks
- Analyze with the Hadoop System -
Review use case requirements & existing hardware, and recommend changes.
Design & develop a process for loading data from up to 2 data sources.
Design & implement a data storage, schema, and partitioning system.
Design & prototype a data integration process.
Design & implement specific data processing jobs and document the solution.
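A common choice for the storage, schema, and partitioning design above is Hive-style date partitioning of the directory layout. This minimal sketch uses a hypothetical `partition_path` helper and placeholder paths to show the idea:

```python
from datetime import date

def partition_path(base, table, event_date):
    """Build a Hive-style partitioned path: <base>/<table>/dt=YYYY-MM-DD.

    Partitioning by date lets processing jobs prune down to the partitions
    they need instead of scanning the whole table.
    """
    return f"{base}/{table}/dt={event_date.isoformat()}"

# Placeholder warehouse path and table name for illustration.
path = partition_path("/data/warehouse", "clicks", date(2014, 6, 1))
# -> /data/warehouse/clicks/dt=2014-06-01
```

The right partition key depends on the dominant query pattern; date works well when most jobs process a recent window of data.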
1 Week
- Authenticate and Authorize Access -
Review security requirements & provide an overview of data security policies.
Audit architecture & systems in light of security policies & best practices.
Install & integrate a local MIT Kerberos KDC with Active Directory.
Review security integration for users & administrators.
Document administration & control features in applicable components.
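A local MIT KDC that trusts an Active Directory realm starts from a `krb5.conf` along these lines. The realm and host names are placeholders, and this fragment is only a sketch: a full cross-realm setup also requires matching `krbtgt` principals on both sides.

```ini
# Illustrative krb5.conf fragment; EXAMPLE.COM and AD.EXAMPLE.COM are placeholders.
[realms]
  EXAMPLE.COM = {
    kdc = kdc01.example.com
    admin_server = kdc01.example.com
  }
  AD.EXAMPLE.COM = {
    kdc = ad01.example.com
  }

[domain_realm]
  .example.com = EXAMPLE.COM
```

With this split, cluster service principals live in the local MIT realm while user identities stay in Active Directory.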
4 Weeks
- Timeline from Conceptualization to Production -
Review cluster architecture, ingestion pipeline, schema & data partitioning system.
Review data jobs & analytic processes, as well as data serving & result-publishing systems.
Recommend performance tuning, data compression & scheduler configuration.
Document the configuration & review the operations team's skills.
Review management and monitoring processes & production procedures.
© Copyright 2014 by Datoop
Hadoop and the Hadoop elephant logo are trademarks of the Apache Software Foundation.