Importing Twitter data into Stata

In the past, we’ve had users ask if Stata could import Twitter data. So we asked one of our interns, Dawson Deere (currently working on his computer science degree at Texas A&M University) to see if he could write a new command to do this. He used Stata 15’s improved Java plugins feature to write a new twitter2stata command. To install twitter2stata, type

ssc install twitter2stata, replace

Once installed, you can do the following:

  • Import tweets based on a search string
    . twitter2stata searchtweets "search_string"

  • Import user data using a search string
    . twitter2stata searchusers "search_string"

  • Import a specific user’s data
    . twitter2stata getuser "userId_or_userName"

  • Import a user’s likes, following, followers, and lists data
    . twitter2stata likes "userId_or_userName"
    . twitter2stata following "userId_or_userName"
    . twitter2stata followers "userId_or_userName"
    . twitter2stata lists "userId_or_userName"

  • Import data about a specific list
    . twitter2stata listusers "listId_or_listName"
    . twitter2stata listtweets "listId_or_listName"

The main purpose of this post is to show you how to get the twitter2stata command working in Stata. Below are the steps you must take.

  1. To use this command, you must have a Twitter account. If you don’t have one, you can create one on the Twitter website.

  2. Twitter limits the amount of data you can download. For the best rate limits, you must create a Twitter app. To do this, log in to the Twitter website and go to https://apps.twitter.com.

    You should see

    [Screenshot: the Twitter apps page at apps.twitter.com]

  3. Click on the Create New App button. You will see

    [Screenshot: the form for creating a new application]

  4. There are three fields you must fill in: Name, Description, and Website. The name of the application must be unique, so you might have to try a few times to find one that is not already taken. Once you have filled in the form, click on Create your Twitter Application. Next, click on the Keys and Access Tokens tab.

    [Screenshot: the Keys and Access Tokens tab]

  5. Click on the Create my access token button to generate your access token and access token secret. You will see

    [Screenshot: the generated access token and access token secret]

  6. You will need to copy the

    • Consumer Key (API Key)
    • Consumer Secret (API Secret)
    • Access Token
    • Access Token Secret

    and paste them into a do-file, for example,

    local consumer_key "xWNlx*N9vESv0ZZBtGdm7fVB"
    local consumer_secret "7D25oVzWeDCHrUlQcp9929@GOcnqWCuUKhDel"
    local access_token "74741598400768-3hAYpZbiDvABPizx5lk57B8CTVyfa"
    local access_token_secret "7HjDf25oVzDWAeDCHrUlQcpfNGOTzcnqWCuUKhDel"
    

    Be sure not to share these with anybody else.

In the same do-file, add the command

twitter2stata setaccess "`consumer_key'" "`consumer_secret'" ///
      "`access_token'" "`access_token_secret'"

to initialize these settings for twitter2stata. If you don’t run twitter2stata setaccess … at the start of each twitter2stata session, you will receive the error below.

. twitter2stata searchtweets "star wars", numtweets(10)
access token and access token secret not set.
Run twitter2stata setaccess to set your access token and access token secret.
r(198);
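If the do-file itself will be shared, one way to keep the keys out of it is to store the four local definitions in a separate private file and pull them in with Stata's include command, which runs the file as though its lines were typed in the current do-file, so the locals stay visible. This is only a sketch: the file name twitter_keys.do is hypothetical, and it assumes that file contains exactly the four local lines shown earlier.

```stata
* twitter_keys.do is a hypothetical private file that defines the locals
* consumer_key, consumer_secret, access_token, and access_token_secret.
* include (unlike do) leaves those locals defined in this do-file.
include "twitter_keys.do"

* Initialize twitter2stata for this session using the included locals
twitter2stata setaccess "`consumer_key'" "`consumer_secret'" ///
      "`access_token'" "`access_token_secret'"
```

With this layout, the do-file that does the analysis can be passed around while twitter_keys.do stays on your own machine.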

My do-file is now

local consumer_key "xWNlx*N9vESv0ZZBtGdm7fVB"
local consumer_secret "7D25oVzWeDCHrUlQcp9929@GOcnqWCuUKhDel"
local access_token "74741598400768-3hAYpZbiDvABPizx5lk57B8CTVyfa"
local access_token_secret "7HjDf25oVzDWAeDCHrUlQcpfNGOTzcnqWCuUKhDel"

twitter2stata setaccess "`consumer_key'" "`consumer_secret'" ///
     "`access_token'" "`access_token_secret'"
twitter2stata searchtweets "star wars", numtweets(10)
list user_screen_name user_follower_count user_friend_count, abbreviate(20)

When I run the do-file, I get

. twitter2stata searchtweets "star wars", numtweets(10)
(45 vars, 10 obs)

. list user_screen_name user_follower_count user_friend_count, abbreviate(20)

     +------------------------------------------------------------+
     | user_screen_name   user_follower_count   user_friend_count |
     |------------------------------------------------------------|
  1. |            024AB                  1297                1077 |
  2. |     StarWarsTime                  2040                1213 |
  3. |      LockerGnome                 24577                 976 |
  4. |           CMG_HD                     8                  35 |
  5. |   dilnyminic1986                     4                  30 |
     |------------------------------------------------------------|
  6. |     StarWarsTime                  2040                1213 |
  7. |   emimsohood1975                     3                   8 |
  8. |  Dan_NinjaRabbit                   712                2252 |
  9. |  KatelynLunsford                    13                  38 |
 10. |        PudseyMac                   335                 444 |
     +------------------------------------------------------------+
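Each observation is a tweet, so an account that tweets more than once shows up more than once; in the listing above, StarWarsTime appears in observations 2 and 6. As a minimal sketch, assuming the ten imported tweets are still in memory and that the goal is one row per account, the data could be reduced and sorted like this:

```stata
* Keep a single observation per account; force is needed because the
* other variables (the tweets themselves) differ across the duplicates
duplicates drop user_screen_name, force

* Sort accounts from most to fewest followers
gsort -user_follower_count

list user_screen_name user_follower_count user_friend_count, abbreviate(20)
```

Note that duplicates drop with force discards the extra tweets, so save the full dataset first if you need the tweet-level data later.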

Again, there are limits to the amount of data Twitter will let you import. These limits are subcommand specific and restrict the number of calls you can make to Twitter’s REST API in each 15-minute window. Twitter publishes a chart of all the data rate limits.
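When running several searches in one do-file, a simple way to live with the 15-minute window is to wait and retry when a call fails. The sketch below rests on assumptions that this post does not confirm: that twitter2stata exits with a nonzero return code when a rate limit is hit, and that clearing the data in memory before each call is acceptable. The second search string and the tweets_1, tweets_2 file names are hypothetical.

```stata
* Hypothetical list of search strings; replace with your own
local searches `""star wars" "star trek""'

local i = 0
foreach s of local searches {
    local ++i
    clear
    * capture noisily shows any error message but lets the do-file continue
    capture noisily twitter2stata searchtweets "`s'", numtweets(10)
    if _rc {
        * Assumed rate-limit failure: wait one full 15-minute window
        * (sleep takes milliseconds), then try once more
        sleep 900000
        clear
        twitter2stata searchtweets "`s'", numtweets(10)
    }
    * Store each result in its own dataset (hypothetical file names)
    save tweets_`i', replace
}
```

If the first call failed for some other reason, such as missing access tokens, the retry fails as well and stops the do-file with the original error.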

You can read the full details of twitter2stata’s functionality in its help file after installing it. Dawson was able to write this command using Stata 15’s improved Java API together with the Java library Twitter4J. In a later post, I will discuss how he went about developing this command and show you how easy it is to write Java code for Stata.