Typically, data models rely on DDLs to generate ER diagrams. But there's a catch: without primary keys and foreign keys, you’re left with an incomplete view.
That's where we come in.
"You might have to deal with many-to-many relationships or have a complex transactional data model with non-standard naming conventions forcing you to join fields like employee_id to person_id (which you wouldn’t know if the Sr. Engineer hadn’t told you).", SeattleDataGuy
Here's how we take it further:
Instead of just using DDLs, we also analyze SQL joins.
By examining how tables are joined in SQL queries, we can infer relationships and automatically populate the foreign keys and primary keys in your data model.
Examples
Example 1 - SaaS App
While existing data modeling tools typically rely solely on DDLs to create ER diagram, we automatically map out PK & FK keys from SQL JOINs.
Code
WITH tag_engagements AS ( SELECT t.tag_name, pt.photo_id, COUNT(l.user_id) AS likes, COUNT(c.id) AS comments FROM tags t JOIN photo_tags pt ON t.id = pt.tag_id LEFT JOIN likes l ON pt.photo_id = l.photo_id LEFT JOIN comments c ON pt.photo_id = c.photo_id GROUP BY t.tag_name, pt.photo_id)SELECT tag_name, AVG(likes + comments) AS avg_engagement, COUNT(*) AS hashtag_usageFROM tag_engagementsGROUP BY tag_nameORDER BY avg_engagement DESC;
Example 2 - Financial
It works with any non-id fields (e.g., ON t.currency = e.currency AND t.transaction_date = e.date;), as long as the query contains JOIN statements.
Code
CREATE TABLE normalized_transactions ASSELECT t.*, CASE WHEN t.currency ='USD' THEN t.amount ELSE t.amount * e.exchange_rate END AS amount_usdFROM clean_transactions tLEFT JOIN exchange_rates e ON t.currency = e.currency AND t.transaction_date = e.date;CREATE TABLE monthly_financial_statements ASSELECT DATE_TRUNC('month', posting_date) ASmonth, a.account_type, SUM(CASE WHEN a.account_type IN ('Asset', 'Expense') THEN daily_net_change ELSE -daily_net_change END) AS net_change, SUM(CASE WHEN a.account_type IN ('Asset', 'Expense') THEN running_balance ELSE -running_balance END) AS end_balanceFROM daily_account_balances bJOIN accounts a ON b.account_id = a.account_idGROUP BY DATE_TRUNC('month', posting_date), a.account_type;
Example 3 - SaaS App v2
Code
-- Total posts by usersSELECT SUM(user_posts.total_posts_per_user)FROM (SELECT users.username,COUNT(photos.image_url) AS total_posts_per_user FROM users JOIN photos ON users.id = photos.user_id GROUP BY users.id) AS user_posts;-- User ranking by postingsSELECT users.username, COUNT(photos.image_url)FROM usersJOIN photos ON users.id = photos.user_idGROUP BY users.idORDER BY 2DESC;-- User segments based on their followings and the tagsWITH user_segments AS ( SELECT u.id AS user_id, COUNT(DISTINCT f.followee_id) AS following_count, COUNT(DISTINCT pt.tag_id) AS tag_count FROM users u LEFT JOIN follows f ON u.id = f.follower_id LEFT JOIN photos p ON u.id = p.user_id LEFT JOIN photo_tags pt ON p.id = pt.photo_id GROUP BY u.id)SELECT user_id, CASE WHEN following_count >=1000 AND tag_count >=100 THEN 'Influencers' WHEN following_count >=500 AND tag_count >=50 THEN 'Active Users' WHEN following_count >=100 AND tag_count >=30 THEN 'Engaged Users' ELSE 'Casual Users' END AS user_segmentFROM user_segments;-- Which hashtags are most effective in driving engagements?WITH tag_engagements AS ( SELECT t.tag_name, pt.photo_id, COUNT(l.user_id) AS likes, COUNT(c.id) AS comments FROM tags t JOIN photo_tags pt ON t.id = pt.tag_id LEFT JOIN likes l ON pt.photo_id = l.photo_id LEFT JOIN comments c ON pt.photo_id = c.photo_id GROUP BY t.tag_name, pt.photo_id)SELECT tag_name, AVG(likes + comments) AS avg_engagement, COUNT(*) AS hashtag_usageFROM tag_engagementsGROUP BY tag_nameORDER BY avg_engagement DESC;