Typically, data models rely on DDLs to generate ER diagrams. But there's a catch: without primary keys and foreign keys, you’re left with an incomplete view.
That's where we come in.
"You might have to deal with many-to-many relationships or have a complex transactional data model with non-standard naming conventions forcing you to join fields like employee_id to person_id (which you wouldn’t know if the Sr. Engineer hadn’t told you).", SeattleDataGuy
Here's how we take it further:
Instead of just using DDLs, we also analyze SQL joins.
By examining how tables are joined in SQL queries, we can infer relationships and automatically populate the foreign keys and primary keys in your data model.
Examples
Example 1 - SaaS App
While existing data modeling tools typically rely solely on DDLs to create ER diagram, we automatically map out PK & FK keys from SQL JOINs.
Code
WITH tag_engagements AS (SELECT t.tag_name, pt.photo_id,COUNT(l.user_id) AS likes,COUNT(c.id) AS commentsFROM tags tJOIN photo_tags pt ON t.id = pt.tag_idLEFT JOIN likes l ON pt.photo_id = l.photo_idLEFT JOIN comments c ON pt.photo_id = c.photo_idGROUP BY t.tag_name, pt.photo_id)SELECT tag_name,AVG(likes + comments) AS avg_engagement,COUNT(*) AS hashtag_usageFROM tag_engagementsGROUP BY tag_nameORDER BY avg_engagement DESC;
Example 2 - Financial
It works with any non-id fields (e.g., ON t.currency = e.currency AND t.transaction_date = e.date;), as long as the query contains JOIN statements.
Code
CREATETABLEnormalized_transactionsASSELECT t.*,CASEWHEN t.currency ='USD'THEN t.amountELSE t.amount * e.exchange_rateENDAS amount_usdFROM clean_transactions tLEFT JOIN exchange_rates e ON t.currency = e.currency AND t.transaction_date = e.date;CREATETABLEmonthly_financial_statementsASSELECT DATE_TRUNC('month', posting_date) ASmonth, a.account_type,SUM(CASEWHEN a.account_type IN ('Asset', 'Expense') THEN daily_net_change ELSE-daily_net_change END) AS net_change,SUM(CASEWHEN a.account_type IN ('Asset', 'Expense') THEN running_balance ELSE-running_balance END) AS end_balanceFROM daily_account_balances bJOIN accounts a ON b.account_id = a.account_idGROUP BY DATE_TRUNC('month', posting_date), a.account_type;
Example 3 - SaaS App v2
Code
-- Total posts by usersSELECTSUM(user_posts.total_posts_per_user)FROM (SELECT users.username,COUNT(photos.image_url) AS total_posts_per_userFROM usersJOIN photos ON users.id = photos.user_idGROUP BY users.id) AS user_posts;-- User ranking by postingsSELECT users.username, COUNT(photos.image_url)FROM usersJOIN photos ON users.id = photos.user_idGROUP BY users.idORDER BY2DESC;-- User segments based on their followings and the tagsWITH user_segments AS (SELECT u.id AS user_id,COUNT(DISTINCT f.followee_id) AS following_count,COUNT(DISTINCT pt.tag_id) AS tag_countFROM users uLEFT JOIN follows f ON u.id = f.follower_idLEFT JOIN photos p ON u.id = p.user_idLEFT JOIN photo_tags pt ON p.id = pt.photo_idGROUP BY u.id)SELECT user_id,CASEWHEN following_count >=1000AND tag_count >=100THEN'Influencers'WHEN following_count >=500AND tag_count >=50THEN'Active Users'WHEN following_count >=100AND tag_count >=30THEN'Engaged Users'ELSE'Casual Users'ENDAS user_segmentFROM user_segments;-- Which hashtags are most effective in driving engagements?WITH tag_engagements AS (SELECT t.tag_name, pt.photo_id,COUNT(l.user_id) AS likes,COUNT(c.id) AS commentsFROM tags tJOIN photo_tags pt ON t.id = pt.tag_idLEFT JOIN likes l ON pt.photo_id = l.photo_idLEFT JOIN comments c ON pt.photo_id = c.photo_idGROUP BY t.tag_name, pt.photo_id)SELECT tag_name,AVG(likes + comments) AS avg_engagement,COUNT(*) AS hashtag_usageFROM tag_engagementsGROUP BY tag_nameORDER BY avg_engagement DESC;