Making a better twitter recommendation engine. Part(1) Scraping Following's Followings.

Apr 20, 2023

I have a simple need. Show me tweets of people who are followed by the accounts I follow (but I don't). I believe this would create a much better "For You" page than whatever shit elon's been cooking in there (seriously.... the for you page is so bad its honestly appalling). I'll add momentum based recommendation (trending) later once I have a system to reliably get the tweets of the people who I follow, follow (if you get what I mean 😅)

So as for part 1 lets make a twitter scrape that scrapes all the handles of accounts a person follows. To help me make sense of this clusterfuck of an API I'm gonna use the divine powers of GPT4 .

Its honestly pretty surprising how good is gpt, at understanding the graph api twitter uses. It reduced the time to build this thing from like a weekend to like a couple of hours.Freakin magical.

This is the request twitter makes to make the initial fetch of the follower list :

Looking closely these are the variables encoded in the URL of the request :

{"userId":"183749519","count":20,"includePromotedContent":false}
{"blue_business_profile_image_shape_enabled":true,"responsive_web_graphql_exclude_directive_enabled":true,"verified_phone_label_enabled":true,"responsive_web_graphql_timeline_navigation_enabled":true,"responsive_web_graphql_skip_user_profile_image_extensions_enabled":false,"tweetypie_unmention_optimization_enabled":true,"vibe_api_enabled":true,"responsive_web_edit_tweet_api_enabled":true,"graphql_is_translatable_rweb_tweet_is_translatable_enabled":true,"view_counts_everywhere_api_enabled":true,"longform_notetweets_consumption_enabled":true,"tweet_awards_web_tipping_enabled":false,"freedom_of_speech_not_reach_fetch_enabled":false,"standardized_nudges_misinfo":true,"tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled":false,"interactive_text_enabled":true,"responsive_web_text_conversations_enabled":false,"longform_notetweets_rich_text_read_enabled":true,"responsive_web_enhance_cards_enabled":false}

In the second scroll of the list a variable called cursor gets added to the request which specified the sub-list of followers to fetch. This cursor is returned from the previous request as the timeline cursor.

{"userId":"183749519","count":20,"cursor":"1763689340909851626|1649001226504765398","includePromotedContent":false}

So to fetch all the list of followers of an account I just need to loop this request with different cursor till the beginning of cursor reaches 0. ie the request returns 0|xyz as the cursor value.

Now to build the parser I could study the whole response orrrrr I can just pass the resonse to GPT4 and ask it to build a parser for me 🤯.

write me python code to extract all the usernames (screen_name) and the pagination number (bottom from these type of json responses (don't rewrite the whole json text just give me a function):

The pagination code was Incorrect as i wasn't specific enough to what exact attribute to extract so I ask it to modify the code.

write me python code to extract the pagination number (bottom) eg: "value" "0|1648981762677669836""cursorType"."Bottom"}: I need "value" "0|1648981762677669836") from these type of ison responses (don't requrite the whole json text just give me a function):

All of this worked the first try OFC. Now the only thing left was to loop this shit till the end of cursor. For this I gave it all the code it had previously written to modify and make the final script.

modify this code to take an input from cli of user id. and fetch all of thier list of followers using extract user names and using a loop for pagination variable. the first part of pagination number becomes 0 xyz when we reach end of the list of followers. this the sample payload {"userld":"1556925600459268098""count":20,"cursor":"1761587287535449529|1648969075902119784",includePromotedContent":false} you'll have to mondiy the pagination number for each request.

aaaand It worked. On the first attempt nonetheless lmao.

It was truly wild seeing it understand the api response so well.

I am planning to create an llm agent that automates all of this by just giving it a url of a webpage and a prompt about what things to scrape... but first lemme fix my twitter recommendations real quick.

Danish Joshi

Making a better twitter recommendation engine. Part(1) Scraping Following's Followings.