Introduction
Facebook, Instagram and Twitter are among the top social media platforms. According to BusinessofApps, Facebook, Instagram and Twitter have an estimated 2,700, 1,160 and 330 million monthly active users respectively. These numbers show that social media has become one of the largest sources of data. Although Facebook and Instagram have more active users, Twitter remains the most popular platform among academic researchers and developers. The main reason is probably the availability of the data Twitter provides and the insights that can be drawn from it.
As of 2018, according to Oberlo, 500 million tweets are sent each day. That equates to 5,787 tweets per second. In the beginning, Twitter was used as just a textual platform but with growing popularity, it is also being used for sharing photos and videos. Apart from sharing their thoughts and ideas, many people use Twitter as a tutorial and community platform.
Twitter Developer Account
Before using the Twitter API, we need to set up a developer account and create a project to get API keys. If you do not have a Twitter account, create one, then go to the Twitter Developers Dashboard and sign in. You will see the dashboard as shown in the image below.
Click on the Create Project button; it will take you to the screen shown below. Give your project a name. You can choose any name, but it should be unique. I named mine MyBlogProject.
After that, select the reason for this project and write a short description of it. (I guess it is not necessary, but they ask for it anyway.)
In the end, you just need to give the application a name. Again, the name does not matter, but it should be unique. Once you press Complete, you will get your API keys as shown in the image.
You can copy these keys now, or you can get them afterwards as well. Do not share your keys with anyone. Now click on Projects and Apps and then Overview in the left side panel; you will see all your projects and standalone apps. You can use either a Project or a Standalone App, or both. For this tutorial, we will use the App which we created inside the Project. You can see that I have both an app inside the project and a standalone app. Earlier there was only one Twitter API, V1.1, but at the time I was writing this blog the new V2 was in the early access stage. So for the project I had access to both versions, whereas for the standalone app I had access to only V1.1.
Exploration of V1.1
Every search request in V1.1 goes to the endpoint https://api.twitter.com/1.1/search/tweets.json. We can use different query parameters as per our needs. In this tutorial, we are using the Standard API, which puts some restrictions on usage. As per the official documentation, the following query parameters are available:
- q
- geocode
- lang
- result_type
- count
- include_entities
- until
- since_id and max_id
- locale (only ja is currently effective, so we will avoid this one.)
- tweet_mode (this has not been documented in official docs.)
From the above-mentioned parameters, only 'q' is the required query parameter and all others are optional parameters.
1. Query Parameter: "q"
The query parameter q is used to search for specific terms, hashtags and users. As an example, we can use #python to get all tweets that contain #python, or we can use "web development" as a single phrase to retrieve all tweets that have "web development" in them. Moreover, by using @elonmusk, we can get tweets that mention the user Elon Musk (to get tweets sent by a user, see the from:user filter later on).
Note: The Standard API returns data only from the last 7 days, and if you do not use count it returns only 15 tweets.
Example of fetching tweets with hashtag: python
https://api.twitter.com/1.1/search/tweets.json?q=%23python
Example of fetching tweets that contain: "web development"
https://api.twitter.com/1.1/search/tweets.json?q="web development"
Example of fetching tweets with username: elonmusk
https://api.twitter.com/1.1/search/tweets.json?q=@elonmusk
Here, in the first example, as you can see, we have used %23 instead of the symbol #. We need this percent-encoded representation because some symbols cannot be used directly in a URI (a literal # would start the URI fragment). By default, the Twitter API returns truncated tweet text. So, to get the full tweets, we have to pass the tweet_mode=extended query parameter.
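The percent-encoding does not have to be done by hand. Below is a minimal sketch using Python's standard urllib.parse module; the helper name build_search_url is my own and not part of any Twitter library. Note that this helper also encodes @ as %40, which is equivalent to the literal @ used in the example above.

```python
from urllib.parse import quote, urlencode

SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(q, **params):
    """Return a search URL with every query parameter percent-encoded.

    build_search_url is a hypothetical helper written for this tutorial.
    """
    query = {"q": q, **params}
    # quote_via=quote encodes a space as %20; the default (quote_plus) would use "+".
    return SEARCH_URL + "?" + urlencode(query, quote_via=quote)


print(build_search_url("#python"))
print(build_search_url('"web development"', tweet_mode="extended"))
print(build_search_url("@elonmusk"))
```

The first call prints the same URI as the hashtag example above, with # replaced by %23.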
2. Query Parameter: "geocode"
Once we have decided what to search and which hashtag, keywords or user to search, we can search tweets for a specific geolocation using the geocode parameter. This query parameter is optional, so one can use it as needed.
Now, if I want to search for the keyword python around Bangalore, I need to pass the latitude, longitude and radius. So, I will pass the query parameter as geocode=12.97194,77.59369,1mi. Here, 12.97194, 77.59369 and 1mi are the latitude, longitude and radius respectively, where 1mi means 1 mile. We can use km (kilometres) as well.
To get tweets that have the python keyword and which are within a mile radius of Bangalore one can use the following URI:
https://api.twitter.com/1.1/search/tweets.json?q=python&geocode=12.97194,77.59369,1mi
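To avoid typos in the geocode value, it can be assembled from its three parts. This is only a small sketch; geocode_param is a hypothetical helper name, and the unit check simply mirrors the two units mentioned above.

```python
def geocode_param(latitude, longitude, radius, unit="mi"):
    """Format the geocode value as "latitude,longitude,radius" plus the unit.

    geocode_param is a hypothetical helper written for this tutorial.
    """
    if unit not in ("mi", "km"):
        raise ValueError("unit must be 'mi' (miles) or 'km' (kilometres)")
    return f"{latitude},{longitude},{radius}{unit}"


# Bangalore, radius of one mile, as in the example above.
print(geocode_param(12.97194, 77.59369, 1))  # 12.97194,77.59369,1mi
```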
3. Query Parameter: "lang"
We can get tweets in a specific language. As an example, we can get tweets that have the keyword India and are written in Hindi. To fetch this kind of data, we can use the lang query parameter as shown below.
https://api.twitter.com/1.1/search/tweets.json?q=India&lang=hi
One can use the following Wikipedia page to get a list of all languages and their respective codes.
4. Query Parameter: "result_type"
Sometimes we want the most recent tweets, and sometimes we want the most popular ones. In these situations, we can use the query parameter result_type. Like geocode, this query parameter is optional.
We can use mixed, recent or popular as the result_type, where mixed is the default value.
5. Query Parameter: "count"
The count parameter is used when we want a specific number of tweets. By default, we will get 15 tweets but in a single request we can get up to a maximum of 100 tweets.
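Since result_type accepts only three values and count is capped at 100, it can help to check both before sending a request. The sketch below does that; result_params and the constant names are hypothetical, and the lower bound of 1 for count is my assumption.

```python
VALID_RESULT_TYPES = ("mixed", "recent", "popular")
MAX_COUNT = 100  # maximum tweets per request, as stated above


def result_params(result_type="mixed", count=15):
    """Return validated result_type and count parameters as a dict.

    result_params is a hypothetical helper written for this tutorial.
    The defaults match the API defaults described above.
    """
    if result_type not in VALID_RESULT_TYPES:
        raise ValueError(f"result_type must be one of {VALID_RESULT_TYPES}")
    if not 1 <= count <= MAX_COUNT:
        raise ValueError(f"count must be between 1 and {MAX_COUNT}")
    return {"result_type": result_type, "count": count}


print(result_params("recent", 50))  # {'result_type': 'recent', 'count': 50}
```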
6. Query Parameter: "include_entities"
One can get some extra data by passing the include_entities query parameter as true.
The URI shown below gives me the extra entities field data.
https://api.twitter.com/1.1/search/tweets.json?q=%22web%20development%22&include_entities=true&count=1
Entities Data:
"entities": {
  "hashtags": [
    {
      "text": "FREECOURSE",
      "indices": [0, 11]
    },
    {
      "text": "FREE",
      "indices": [88, 93]
    },
    {
      "text": "online",
      "indices": [94, 101]
    },
    {
      "text": "udemy",
      "indices": [102, 108]
    }
  ],
  "symbols": [],
  "user_mentions": [],
  "urls": [
    {
      "url": "https://t.co/YcPoOiGnNu",
      "expanded_url": "https://www.udemy.com/course/bootstrap-3-responsive-design-tutorial-fundamentals/?couponCode=DISCOVERYVIP",
      "display_url": "udemy.com/course/bootstr…",
      "indices": [64, 87]
    },
    {
      "url": "https://t.co/icOzd6jBUe",
      "expanded_url": "https://twitter.com/i/web/status/1376231256258719747",
      "display_url": "twitter.com/i/web/status/1…",
      "indices": [110, 133]
    }
  ]
}
One thing to notice here is, I have used %22 instead of double quotes (") and %20 instead of white space.
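Once the response is parsed into a Python dict, the entities field is easy to work with. The sketch below pulls the hashtag texts and expanded URLs out of a single tweet; summarize_entities is a hypothetical helper, and the sample tweet is a trimmed copy of the entities data shown above.

```python
def summarize_entities(tweet):
    """Return (hashtags, links) found in one tweet dict.

    summarize_entities is a hypothetical helper written for this tutorial.
    Missing fields are treated as empty, since entities are optional.
    """
    entities = tweet.get("entities", {})
    hashtags = [tag["text"] for tag in entities.get("hashtags", [])]
    links = [url["expanded_url"] for url in entities.get("urls", [])]
    return hashtags, links


# Trimmed sample based on the entities data above.
sample_tweet = {
    "entities": {
        "hashtags": [
            {"text": "FREECOURSE", "indices": [0, 11]},
            {"text": "udemy", "indices": [102, 108]},
        ],
        "urls": [
            {
                "url": "https://t.co/icOzd6jBUe",
                "expanded_url": "https://twitter.com/i/web/status/1376231256258719747",
                "indices": [110, 133],
            }
        ],
    }
}

hashtags, links = summarize_entities(sample_tweet)
print(hashtags)  # ['FREECOURSE', 'udemy']
print(links)     # ['https://twitter.com/i/web/status/1376231256258719747']
```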
7. Query Parameter: "until"
To get all tweets created before a specific date, we can use until query parameter. The date should be formatted as YYYY-MM-DD. Keep in mind that the search index has a 7-day limit (for Standard API). In other words, no tweets will be found for a date older than one week.
When I was writing this blog it was 2021-03-28, so I could request data going back as far as 2021-03-22. If I request the date 2021-03-21, it will give me an empty array. One can use the following format to use the until parameter:
https://api.twitter.com/1.1/search/tweets.json?q=python&until=2021-03-22
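The oldest usable until date can be computed instead of counted on a calendar. This sketch assumes the behaviour I observed above (on 2021-03-28 the oldest date that still returned tweets was 2021-03-22, that is, six days back); oldest_until and until_param are hypothetical helper names.

```python
from datetime import date, timedelta


def oldest_until(today, window_days=7):
    """Return the oldest until date that should still yield results.

    Assumption based on the observation above: with a 7-day index,
    the oldest useful date is today minus (window_days - 1) days.
    """
    return today - timedelta(days=window_days - 1)


def until_param(day):
    """Format a date as YYYY-MM-DD, the format the until parameter expects."""
    return day.strftime("%Y-%m-%d")


print(until_param(oldest_until(date(2021, 3, 28))))  # 2021-03-22
```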
8. Query Parameter: "since_id" and "max_id"
Honestly, I could not find in the documentation how we can get or generate since_id and max_id.
As per the documentation, if we use since_id, it will return results with an ID greater than (that is, more recent than) the specified ID. On the other hand, max_id will return results with an ID less than (that is, older than) or equal to the specified ID.
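As far as I can tell, these values are ordinary tweet IDs: every tweet in a search response carries an id field, so the IDs from one page of results can be fed into the next request. The sketch below shows one common way to walk backwards through results under that assumption; next_max_id and newest_id are hypothetical helper names, and the sample IDs are made up apart from the first one, which appears in the entities data earlier.

```python
def next_max_id(statuses):
    """Return the max_id to use for the next (older) page, or None if the page is empty.

    max_id is inclusive ("less than or equal"), so we subtract 1 from the
    smallest ID to avoid receiving the oldest tweet of this page again.
    """
    if not statuses:
        return None
    return min(status["id"] for status in statuses) - 1


def newest_id(statuses):
    """Return the largest ID on the page, usable as since_id to fetch only newer tweets."""
    if not statuses:
        return None
    return max(status["id"] for status in statuses)


# One page of results, reduced to the id field.
page = [
    {"id": 1376231256258719747},
    {"id": 1376231256258719000},
    {"id": 1376231256258718500},
]

print(next_max_id(page))  # 1376231256258718499
print(newest_id(page))    # 1376231256258719747
```

Repeating the request with max_id set to the value from next_max_id, until a page comes back empty, collects more than the 100 tweets a single request allows.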
Boolean Syntax
We can use boolean operators and grouping mechanisms to get more specific tweets. We can use the logical AND, OR and NOT (-) operators. We can use round parentheses for grouping multiple keywords and filters.
If we want to search for tweets that contain python and developer, then we can write q=python%20developer. This will search for tweets with both keywords, python and developer. Note that python and developer do not need to appear next to each other.
If we want them together, we can write q=%22python%20developer%22 or q="python developer". If we want tweets with either python or developer, then we can write q=(python OR developer).
If we want to ignore tweets with some keywords, we can do so using a hyphen (-). So, to get tweets with the keyword python or Django and ignore developer, we can write a query like q=(python OR Django) -developer.
Note: We can use multiple OR and negation operators in our query. To negate multiple terms, instead of using -(iPhone OR iMac OR MacBook), use the following: -iPhone -iMac -MacBook.
There can be some uncertainty while using multiple operations. Example:
- apple OR iPhone iPad would be evaluated as apple OR (iPhone iPad)
- iPad iPhone OR android would be evaluated as (iPhone iPad) OR android
To eliminate uncertainty and ensure that your rules are evaluated as intended, group terms together with parentheses where appropriate. For example:
(apple OR iPhone) iPad
iPhone (iPad OR android)
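The grouping rules above can be wrapped in a few small helpers so that parentheses and hyphens are never forgotten. This is only a sketch; any_of, none_of and phrase are hypothetical names, and the result still has to be percent-encoded before it goes into the q parameter.

```python
def phrase(text):
    """Wrap text in double quotes so it is matched as an exact phrase."""
    return f'"{text}"'


def any_of(*terms):
    """Group terms with OR inside parentheses, e.g. (python OR Django)."""
    return "(" + " OR ".join(terms) + ")"


def none_of(*terms):
    """Negate each term separately, e.g. -iPhone -iMac -MacBook."""
    return " ".join("-" + term for term in terms)


print(f"{any_of('python', 'Django')} {none_of('developer')}")
# (python OR Django) -developer
print(f"{any_of('apple', 'iPhone')} iPad")
# (apple OR iPhone) iPad
print(f"{phrase('python developer')} {none_of('iPhone', 'iMac', 'MacBook')}")
# "python developer" -iPhone -iMac -MacBook
```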
More Filters
As per other online resources and official documentation, we can filter data by replies, retweets and based on whether the account is verified or not as well.
No | Filter | Explanation |
---|---|---|
1 | filter:retweets | Includes retweets |
2 | -filter:retweets | Excludes retweets |
3 | filter:replies | Includes replies |
4 | -filter:replies | Excludes replies |
5 | filter:verified | Includes tweets from verified accounts only |
6 | -filter:verified | Excludes tweets from verified accounts |
7 | exclude:retweets | Excludes retweets |
8 | exclude:replies | Excludes replies |
9 | since:YYYY-MM-DD | fetch tweets since mentioned date |
10 | until:YYYY-MM-DD | fetch tweets until mentioned date |
To combine these filters, we use the logical AND and OR operators, which behave as their names suggest. To understand more, we can go through some examples.
- Get tweets with the keyword python and exclude retweets:
https://api.twitter.com/1.1/search/tweets.json?q=python AND -filter:retweets
or https://api.twitter.com/1.1/search/tweets.json?q=python AND exclude:retweets
- Get tweets with the keyword python and exclude replies:
https://api.twitter.com/1.1/search/tweets.json?q=python AND -filter:replies
or https://api.twitter.com/1.1/search/tweets.json?q=python AND exclude:replies
- Get tweets with the keyword python and exclude retweets and replies:
https://api.twitter.com/1.1/search/tweets.json?q=python AND -filter:retweets AND -filter:replies
- Get tweets with the keyword "apple iPad" and from a verified account:
https://api.twitter.com/1.1/search/tweets.json?q="apple iPad" AND filter:verified
Some more filters are:
No | Filter | Explanation |
---|---|---|
1 | filter:links | Includes tweets with links |
2 | -filter:links | Excludes tweets with links |
3 | filter:images | Includes tweets with images |
4 | -filter:images | Excludes tweets with images |
5 | filter:videos | Includes tweets with videos |
6 | -filter:videos | Excludes tweets with videos |
7 | from:user | Brings back tweets from named user |
8 | to:user | Brings back tweets sent to named user |
9 | -has:hashtags | Includes tweets without hashtags |
10 | has:hashtags | Includes tweets with hashtags (for some reason it is not working at the moment) |
There are many other filters, and it is not practical to mention them all here. You can find all the other filters in the official documentation.
Note: In from:user and to:user, write the username without @.
One last example:
Let's assume we want to get the 50 most recent tweets with the keywords "python developer" AND Django while ignoring flask. We want only tweets and replies (in other words, we don't want retweets). We need full tweets in the English (en) language around the Bangalore area, and we do not want extra entities.
Solution:
https://api.twitter.com/1.1/search/tweets.json?q=("python developer" AND django) -flask AND exclude:retweets&tweet_mode=extended&lang=en&count=50&result_type=recent&include_entities=false&geocode=12.97194,77.59369,10mi
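For completeness, here is a rough sketch of how this request could be sent from Python using only the standard library. It assumes app-only authentication with a bearer token generated for your app (YOUR_BEARER_TOKEN is a placeholder) and that the endpoint is available to your account. build_request and fetch_tweets are hypothetical helper names, and result_type and include_entities are set explicitly to match the requirements stated above.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_request(params, bearer_token):
    """Build (but do not send) an authenticated search request."""
    # quote_via=quote percent-encodes spaces as %20, quotes as %22, and so on.
    url = SEARCH_URL + "?" + urlencode(params, quote_via=quote)
    return Request(url, headers={"Authorization": f"Bearer {bearer_token}"})


def fetch_tweets(request):
    """Send the request and return the list of tweets from the statuses field."""
    with urlopen(request) as response:
        return json.load(response)["statuses"]


params = {
    "q": '("python developer" AND django) -flask AND exclude:retweets',
    "tweet_mode": "extended",
    "lang": "en",
    "count": 50,
    "result_type": "recent",
    "include_entities": "false",
    "geocode": "12.97194,77.59369,10mi",
}

# Placeholder token: replace it with your own and never commit it to a repository.
request = build_request(params, "YOUR_BEARER_TOKEN")
print(request.full_url)
# tweets = fetch_tweets(request)  # uncomment once a real token is in place
```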
Conclusion
I have tried to explain all the methods and filters one can use for fetching tweets using the Twitter API. I wrote this blog to introduce developers to the standard Twitter API; there might be many other query parameters and filters that are not mentioned here.