Session 7B

Building and Managing FINRA’s 200 Petabyte Data Lake


FINRA is dedicated to protecting investors and safeguarding market integrity. Fulfilling this mission requires ingesting, processing, and analyzing tremendous amounts of data. Over the past several years, FINRA has built, grown, and maintained one of the largest financial data lakes in the world, which now exceeds 200 petabytes in size.

This session describes the architectural principles and technical approach behind building a massive cloud data lake; covers how the technology comes together to fulfill core data processing and analytics requirements; zooms in on the evolution of key analytics that are central to FINRA’s mission; illustrates the importance of metadata in a data lake; presents real-world statistics on growth over time in data volume, compute resources, and storage footprint; includes in-depth stories about scaling the technology and managing costs; and touches on how FINRA achieves security and compliance in the cloud.


Aaron Carreras

VP of Data Management & Transparency Services Technology, FINRA

Nate Weisz

Sr. Director of Data Management, FINRA



View our complete list of 2021 speakers here.