MySQL is becoming more and more popular as a database for data warehouse and business intelligence solutions, where big hardware and expensive commercial database have dominated the scene so far.
The use of MySQL in a data warehouse environment is in some way a collision of four worlds placed in two parallel universes. In one universe, Data Architects and DBAs know how to model and manage large multidimensional DBs, they know how tune queries when they can parallelize them or when they can manage a large amount of caching against large Storage Area Networks. In another universe, Developers and DBAs are used to query and manage lots of small boxes running MySQL against their web applications, looking for the simplest queries and smallest results. When OLAP architects try to use MySQL and Web Developers build a multidimensional DB, the results are often far from what is expected.
With a correct implementation, MySQL can be used to power very large and complex OLAP DBs, at a fraction of the cost of a commercial solution.
The objective of this presentation is to give some technical tips and insight hints on how to use MySQL for data warehouse solutions. We will present real examples on: - How to model a multidimensional DB for MySQL, diverging from the classic Kimballs methodology; - How to efficiently use storage engines, caches and indexes available for MySQL for OLAP solutions, compared to what is available for other RDBMSs; - How to query distributed DBs, combining the sharding approach with a multidimensional modelling.