Snowpark-Optimized Warehouses: Production-Ready ML Training and Other Memory-Intensive Operations

With Snowpark, our customers have begun to leverage Snowflake for more complex data engineering and data science workloads using languages such as Java and Python. This new wave of developers using Snowflake often requires more flexibility in the underlying compute infrastructure to unlock memory-intensive operations on large data sets such as ML training.

To support these workloads in production, we’re excited to launch Snowpark-optimized warehouses in general availability in all Snowflake regions across AWS, Azure, and GCP. 

Demo: Running 200 forecasts in 10 minutes using XGBoost and Snowpark-optimized warehouses

Snowpark-optimized warehouses have compute nodes with 16x the memory and 10x the local cache compared with standard warehouses. The larger memory helps unlock memory-intensive use cases on large data sets such as ML training, ML inference, data exports from object storage, and other memory-intensive analytics that could not previously be accommodated in standard warehouses. 

As a result, data teams can now run end-to-end ML pipelines in Snowflake in a fully managed manner without having to use additional systems or move data across governance boundaries.


Snowpark-optimized warehouses also inherit all the benefits of Snowflake virtual warehouses:

  • Fully managed: Snowflake oversees the maintenance, security patching, tuning, and delivery of the latest performance enhancements transparently
  • Elastic: Elastic scaling of compute supports virtually any number of users, jobs, or data with multi-tenant security and resource isolation
  • Reliable: Industry-leading SLA is consistently upheld
  • Secure: Governance controls are applied across all workloads without trade-offs

Since the new warehouse option was announced in public preview in November 2022, we’ve rolled out performance improvements, increased region availability, and made behind-the-scenes stability improvements.

The 10x larger local cache on each Snowpark-optimized warehouse node speeds up subsequent runs when cached artifacts (Python packages, JARs, intermediate results, etc.) are reused. With these performance improvements, Snowpark developers get more out of each compute credit and process large data sets more efficiently. We have also invested in improving the performance of the most popular Python libraries by adding Joblib multiprocessing support in Snowpark for Python stored procedures.
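
To make the Joblib pattern concrete, here is a minimal sketch of fanning out many small, independent model fits, in the spirit of the multi-forecast demo above. It assumes the `joblib` package is available; the function names `forecast_next` and `forecast_all` and the toy least-squares model are illustrative assumptions, not part of Snowpark. Inside a Python stored procedure, a handler could call a function like `forecast_all` on data it has loaded.

```python
from joblib import Parallel, delayed


def forecast_next(values):
    """Fit y = a + b*x by least squares (x = 0..n-1) and predict the next point."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx if sxx else 0.0  # a single point has no trend
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


def forecast_all(series_by_id, n_jobs=2):
    """Run one independent fit per series, spread across n_jobs workers."""
    ids = list(series_by_id)
    results = Parallel(n_jobs=n_jobs)(
        delayed(forecast_next)(series_by_id[i]) for i in ids
    )
    return dict(zip(ids, results))
```

Because each series is fitted independently, the work parallelizes cleanly; the `n_jobs` value here is arbitrary and would normally be tuned to the cores available on the node.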

In addition to unlocking single-node ML training use cases, Snowpark-optimized warehouses also include optimizations for multi-node use cases. When UDFs are run on a warehouse with multiple nodes (size L or larger), Snowflake will leverage the full power of the warehouse by parallelizing computations through redistribution of rows between nodes in the warehouse. Statistics on UDF execution progress are used to balance the distribution of work among compute nodes and maximize parallelism.
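
The value of progress-aware work distribution can be illustrated with a toy simulation. This is not Snowflake's scheduler, whose internals are not described here; it is a simplified sketch that assumes each batch of rows has a known UDF cost and compares a fixed round-robin split with handing each batch to whichever node frees up first. The function names and cost numbers are hypothetical.

```python
import heapq


def static_makespan(batch_costs, num_nodes):
    """Round-robin assignment fixed up front; returns the busiest node's total time."""
    loads = [0.0] * num_nodes
    for i, cost in enumerate(batch_costs):
        loads[i % num_nodes] += cost
    return max(loads)


def dynamic_makespan(batch_costs, num_nodes):
    """Give each batch to the node that becomes free first; returns the finish time."""
    free_at = [0.0] * num_nodes  # min-heap of times at which each node is idle
    heapq.heapify(free_at)
    for cost in batch_costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


# Skewed workload: every other batch is 8x more expensive to process.
costs = [8, 1, 8, 1, 8, 1, 8, 1]
print(static_makespan(costs, 2))   # 32.0
print(dynamic_makespan(costs, 2))  # 18.0
```

With the skewed costs above, the fixed split leaves one node idle most of the time, while the progress-aware assignment keeps both nodes busy and finishes sooner.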

Since moving to public preview, we have seen the adoption of a variety of memory-intensive use cases by customers such as Spring Oaks Capital and Innovid.

Customer success stories

Spring Oaks Capital is a national financial technology company that focuses on the acquisition of consumer credit portfolios. The data science team evaluates millions of records to provide predictions that give their team the insights needed to optimize their debt pricing and purchasing strategies. One of their machine learning models runs every morning to provide call centers with prioritized call lists based on expected conversion. 

To ensure the highest levels of productivity with the latest set of features, Spring Oaks needs to compute large amounts of feature data reliably every morning. Watch an overview of the architecture that has given Spring Oaks 8x performance over the prior solution. 

Innovid, which powers advertising delivery, personalization, and measurement for the world’s largest brands, has also been using Snowpark-optimized warehouses. Innovid collects approximately 6 billion data points from over 1 billion ads each day. Using Snowpark-optimized warehouses, the data science team is able to process these very large data sets and train ML models to provide sophisticated solutions in cross-platform ad serving, data-driven creative, and converged TV measurements for their global client base. Read more about Innovid’s experience using Snowpark for ML.

How to get started

You can get started with Snowpark-optimized warehouses by following the usage instructions in our documentation and quickstart guide, which include step-by-step setup instructions and product details. We’re continuously looking for ways to improve, so if you have any questions or feedback about the product, make sure to let us know in the Snowflake Forums community.
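
As a quick sketch of what setup can look like, the snippet below builds a `CREATE WAREHOUSE` statement with `WAREHOUSE_TYPE = 'SNOWPARK-OPTIMIZED'`. The helper name, the warehouse name `SNOWPARK_OPT_WH`, and the `MEDIUM` size are illustrative assumptions; check the documentation for the sizes and options available in your account.

```python
def create_snowpark_optimized_warehouse_sql(name, size="MEDIUM"):
    """Build a CREATE WAREHOUSE statement for a Snowpark-optimized warehouse."""
    # Accept only simple identifiers so the name is safe to place in the SQL text.
    if not name.replace("_", "").isalnum():
        raise ValueError(f"unexpected characters in warehouse name: {name!r}")
    return (
        f"CREATE OR REPLACE WAREHOUSE {name} WITH "
        f"WAREHOUSE_SIZE = '{size}' "
        f"WAREHOUSE_TYPE = 'SNOWPARK-OPTIMIZED'"
    )


stmt = create_snowpark_optimized_warehouse_sql("SNOWPARK_OPT_WH")
print(stmt)

# With an open Snowpark session (not created in this sketch), the statement
# could then be run with: session.sql(stmt).collect()
```

Keeping the statement in a small helper makes it easy to create a dedicated warehouse for memory-intensive jobs while leaving other workloads on standard warehouses.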

The post "Snowpark-Optimized Warehouses: Production-Ready ML Training and Other Memory-Intensive Operations" appeared first on Snowflake.