Skip to main content
NetApp Solutions

Customer Use Cases

Contributors nkarthik

NetApp ActiveIQ Use Case

ActiveIQ old architecture

Challenge: NetApp's own internal Active IQ solution, initially designed for supporting numerous use cases, had evolved into a comprehensive offering for both internal users and customers. However, the underlying Hadoop/MapR-based backend infrastructure posed challenges around cost and performance, due to the rapid growth of data and the need for efficient data access. Scaling storage meant adding unnecessary computing resources, resulting in increased costs.

Additionally, managing the Hadoop cluster was time-consuming and required specialized expertise. Data performance and management issues further complicated the situation, with queries taking an average of 45 minutes and resource starvation due to misconfigurations. To address these challenges, NetApp sought an alternative to the existing legacy Hadoop environment and determined a new modern solution built on Dremio would reduce costs, decouple storage and compute, improve performance, simplify data management, offer fine-grained controls, and provide disaster recovery capabilities.

Solution:
ActiveIQ new architecture with dremio
Dremio enabled NetApp to modernize its Hadoop-based data infrastructure in a phased approach, providing a roadmap for unified analytics. Unlike other vendors that required significant changes to data processing, Dremio seamlessly integrated with existing pipelines, saving time and expenses during migration. By transitioning to a fully containerized environment, NetApp reduced management overhead, improved security, and enhanced resilience. Dremio's adoption of open ecosystems like Apache Iceberg and Arrow ensured future-proofing, transparency, and extensibility.

As a replacement for the Hadoop/Hive infrastructure, Dremio offered functionality for secondary use cases through the semantic layer. While the existing Spark-based ETL and data ingestion mechanisms remained, Dremio provided a unified access layer for easier data discovery and exploration without duplication. This approach significantly reduced data replication factors and decoupled storage and computing.

Benefits:
With Dremio, NetApp achieved significant cost reductions by minimizing compute consumption and disk space requirements in their data environments. The new Active IQ Data Lake is comprised of 8,900 tables holding 3 petabytes of data, compared to the previous infrastructure with over 7 petabytes. The migration to Dremio also involved transitioning from 33 mini-clusters and 4,000 cores to 16 executor nodes on Kubernetes clusters. Even with significant decreases in computing resources, NetApp experienced remarkable performance improvements. By directly accessing data through Dremio, query runtime decreased from 45 minutes to 2 minutes, resulting in 95% faster time to insights for predictive maintenance and optimization. The migration also yielded a more than 60% reduction in compute costs, more than 20 times faster queries, and more than 30% savings in total cost of ownership (TCO).

Auto Parts Sales Customer Use Case.

Challenges: Within this global auto parts sales company, executive and corporate financial planning and analysis groups were unable to get a consolidated view of sales reporting and were forced into reading the individual line of business sales metrics reports and attempting to consolidate them. This resulted in customers making decisions with data that was at least one day old. The lead times to get new analytics insights would typically take more than four weeks. Troubleshooting data pipelines would require even more time, adding an additional three days or more to the already long timeline. The slow report development process as well as report performance forced the analyst community to continually wait for data to process or load, rather than enabling them to find new businesses insights and drive new business behavior. These troubled environments were composed of numerous different databases for different lines of businesses, resulting in numerous data silos. The slow and fragmented environment complicated data governance as there were too many ways for analysts to come up with their own version of the truth versus a single source of truth. The approach cost over $1.9 million in data platform and people costs. Maintaining the legacy platform and filling data requests required seven Field Technical Engineers (FTEs) per year. With data requests growing, the data intelligence team could not scale the legacy environment to meet future needs

Solution: Cost-effectively store and manage large Iceberg tables in NetApp Object Store. Build data domains using Dremio's semantic layer, allowing business users to easily create, search, and share data products.

Benefits to customer:
• Improved and optimized existing data architecture and reduced time to insights from four weeks to just hours
• Reduced troubleshooting time from three days to only hours
• Decreased data platform and management costs by more than $380,000
• (2) FTEs of Data Intelligence effort saved per year