The One Big Table (OBT) data modeling approach stores all relevant data in a single large denormalized table rather than distributing it across multiple normalized tables. This simplifies the data model and makes it easier to use: access is direct, which benefits fast ad hoc analysis and less technical teams, who no longer have to manage joins or relationships between tables.
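To make the contrast concrete, here is a minimal sketch of an OBT using DuckDB in Python. The table and column names (sales_obt, customer_name, category, and so on) are illustrative assumptions, not taken from the source; answering the same question against a normalized schema would require joining orders, customers, and products.

```python
import duckdb

# One flat, denormalized table: customer and product attributes are copied
# into every sales row instead of living in separate dimension tables.
con = duckdb.connect()
con.execute("""
    CREATE TABLE sales_obt (
        order_id      INTEGER,
        order_date    DATE,
        customer_name VARCHAR,        -- denormalized from customers
        country       VARCHAR,        -- denormalized from customers
        product_name  VARCHAR,        -- denormalized from products
        category      VARCHAR,        -- denormalized from products
        quantity      INTEGER,
        amount        DECIMAL(10, 2)
    )
""")

# Analysts query a single table: no joins, no knowledge of the relational
# model. The normalized equivalent would need two or three JOIN clauses.
revenue = con.execute("""
    SELECT category, country, SUM(amount) AS revenue
    FROM sales_obt
    GROUP BY category, country
    ORDER BY revenue DESC
""").fetchall()
```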
With the rise of Large Language Models (LLMs), the OBT approach also facilitates Text-to-SQL (see page 68): against a single flat table, the queries to be generated are simpler, with no joins or relationships for the model to infer.
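As a hedged illustration of why this helps Text-to-SQL, the sketch below builds a prompt from the flat schema introduced above. No particular LLM or API is implied, and build_prompt is a hypothetical helper: the point is only that the whole schema fits in a few lines and joins can be ruled out entirely.

```python
# Illustrative only: prompt construction for Text-to-SQL over an OBT.
OBT_SCHEMA = (
    "sales_obt(order_id, order_date, customer_name, country, "
    "product_name, category, quantity, amount)"
)

def build_prompt(question: str) -> str:
    # With a single table, the instruction can forbid joins outright,
    # shrinking the space of queries the model can get wrong.
    return (
        f"Given this table:\n{OBT_SCHEMA}\n\n"
        "Write a SQL query answering the question. "
        "Use only the table above; no joins are needed.\n"
        f"Question: {question}"
    )

print(build_prompt("Total revenue per country in 2024?"))
```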
The redundancy introduced by denormalization is not a major issue in column-oriented databases such as Snowflake and BigQuery: columnar compression absorbs most of the duplicated values, and storage is cheap enough that optimization effort can focus on reducing compute costs instead.
The drawback lies in maintenance: managing a single massive table can become complex, especially when data changes frequently or when new sources are added. Denormalization can also introduce inconsistencies if it is not managed carefully, so robust processes are needed to keep the data consistent.
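One way to make such a process concrete is a functional-dependency check: attributes copied from a dimension should still map one-to-one to their natural key after each refresh. The sketch below reuses the illustrative sales_obt table and a hypothetical warehouse.duckdb file, and flags products whose copied category has drifted.

```python
import duckdb

con = duckdb.connect("warehouse.duckdb")  # hypothetical warehouse file

# In a healthy OBT, each product_name should carry exactly one category;
# more than one means a refresh copied stale or conflicting dimension data.
drift = con.execute("""
    SELECT product_name, COUNT(DISTINCT category) AS n_categories
    FROM sales_obt
    GROUP BY product_name
    HAVING COUNT(DISTINCT category) > 1
""").fetchall()

if drift:
    raise ValueError(f"Denormalization drift detected: {drift}")
```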
Theodo’s point of view
We recommend weighing the long-term impact carefully: maintenance and data quality can become challenging. Depending on the use case, One Big Table can be combined with normalized structures, keeping the ease of access of a flat table while retaining the flexibility and manageability of a traditional relational model.
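A common way to realize this hybrid, sketched below under the same illustrative names, is to keep the normalized tables (orders, customers, products) as the source of truth and treat the OBT as a derived artifact rebuilt from them; the warehouse file and schema are assumptions, not the source's.

```python
import duckdb

con = duckdb.connect("warehouse.duckdb")  # hypothetical warehouse file

# The joins are paid once, at build time, rather than by every consumer;
# fixing data in the normalized tables and rebuilding restores consistency.
con.execute("""
    CREATE OR REPLACE TABLE sales_obt AS
    SELECT o.order_id, o.order_date,
           c.customer_name, c.country,
           p.product_name, p.category,
           o.quantity, o.amount
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN products  p ON p.product_id  = o.product_id
""")
```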
MDN’s point of view
One Big Table simplifies data access and makes business teams autonomous. It is an essential lever for improving adoption and use of the semantic layer (gold layer). By choosing OBTs, joins are avoided and indicators are computed at the finest level of granularity needed for their consumption.