My previous post gave you some working code to get you started extracting Google analytics data from the API. While visits and pageviews are important metrics they don’t mean much without context. In this post I’ll extend the code from the previous post to show the visits and pageviews for each referrer to a site. That’ll put some meat on these bones!
First, I’ll focus on just the creation of the query URI. In part 1, our query URL looked like:
query_uri = gdata.analytics.client.DataFeedQuery({
'ids': PROFILE_ID,
'start-date': sd,
'end-date': ed,
'dimensions': 'ga:date',
'metrics': 'ga:visits',
})
and had the output:
ga:date ga:visits
20111121 3214
20111122 2692
20111123 2360
20111124 1537
20111125 2227
20111126 2171
20111127 2220
A query asks the Google Analytics API for all metrics for each dimension specified. In this case the query asks for all “ga:visits” for each “ga:date” dimension which are recorded over the date range from “start-date” to “end-date” for the “ids” specified.
Here’s a new query to show visits and pageviews for all referrers each day inside the date range:
query_uri = gdata.analytics.client.DataFeedQuery({
'ids': PROFILE_ID,
'start-date': sd,
'end-date': ed,
'dimensions': 'ga:date,ga:source',
'metrics': 'ga:visits,ga:pageviews',
})
And its output looks like:
ga:date ga:source ga:visits ga:pageviews
20111121 (direct) 1029 7806
20111121 ask 8 44
20111121 austin360.com 1 12
20111121 bing 133 1130
20111121 en.wikipedia.org 2 40
20111121 facebook.com 2 9
20111121 google 1550 10642
20111121 google.com 6 27
20111121 home.myhughesnet.com 1 2
20111121 yellowbook.com 3 15
20111121 yellowpages.com 1 5
20111121 yelp.com 1 9
... many entries eliminated to protect the innocent
20111122 (direct) 844 4728
20111122 aol 5 11
20111122 ask 4 32
20111122 yahoo 124 1005
20111122 yellowbook.com 2 10
20111122 yellowpages.com 3 31
... many entries eliminated to protect the innocent
20111123 (direct) 793 4256
20111123 aol 9 62
20111123 ask 3 75
20111123 bing 118 867
20111123 yelp.com 1 7
... many entries eliminated to protect the innocent
20111124 (direct) 475 3526
20111124 aol 4 31
20111124 ask 5 37
20111124 bing 60 389
... many entries eliminated to protect the innocent
20111125 (direct) 676 4041
20111125 aol 6 42
20111125 ask 7 44
20111125 bing 99 588
20111125 yahoo 130 1152
20111125 yellowbook.com 3 14
... many entries eliminated to protect the innocent
20111126 (direct) 664 3524
20111126 aol 5 35
20111126 ask 9 34
20111126 bing 76 599
20111126 sxsw.com 7 17
20111126 yahoo 138 1002
20111126 yellowbook.com 5 18
20111126 yellowpages.com 1 19
20111126 yelp.com 2 48
... many entries eliminated to protect the innocent
20111127 (direct) 636 3507
20111127 aol 7 127
20111127 ask 7 67
20111127 bing 84 548
20111127 en.wikipedia.org 1 7
20111127 facebook.com 2 12
20111127 google.com 10 72
20111127 m.yp.com 2 5
20111127 mail.aol.com 1 4
20111127 yahoo 127 1074
20111127 yellowbook.com 3 10
20111127 yellowpages.com 1 1
20111127 yelp.com 2 38
Drop the date dimension and you get all referrers in your date range with no date grouping by the “ga:date” dimension. The query:
query_uri = gdata.analytics.client.DataFeedQuery({
'ids': PROFILE_ID,
'start-date': sd,
'end-date': ed,
'dimensions': 'ga:source',
'metrics': 'ga:visits,ga:pageviews',
})
It’s output:
ga:source ga:visits ga:pageviews
(direct) 5117 31388
aol 50 386
ask 43 333
bing 677 4705
bookmarks.yahoo.com 2 6
business.com 1 7
google 7937 53270
mws.ask.com 1 10
my.msn.com 1 41
my.yahoo.com 1 3
sxsw.com 33 89
yahoo 875 6789
yellowbook.com 25 104
yellowpages.com 7 68
yelp.com 6 102
... many entries eliminated to protect the innocent
Both of those outputs are valuable. First, the referrers, their referral volume, by date is quite valuable. I’d expect to use the second version to get an overview of aggregate referreral volume without the date segmentation to get a feel for a site’s referral health. I’d use the first version to find out when the referrals were delivered.
Here’s the full code for the Python program that produced the second output:
#!/usr/bin/python
'''
Use the Google Data python module to query Google Analytics
You'll get the "PROFILE_ID" from your Google Analytics account. From the default
listing you click on the account name and then take the "Edit" action on the website
profile you want to find the profile id for. It is listed near the top right under
"Profile Settings"
'''
__author__ = 'Leo Edmiston-Cyr '
import gdata.analytics.client
import datetime
USERNAME = 'yourAnalyticsAccount@gmail.com'
PASSWORD = 'youToughPassw0rd'
PROFILE_ID = 'ga:1234567' # the GA profile ID to query
SOURCE_APP_NAME = 'GAGettah' # anything you want to call it
sd = datetime.date(2011,11,21)
ed = datetime.date(2011,11,27)
#ed = datetime.date.today()
def main ():
my_client = gdata.analytics.client.AnalyticsClient(source=SOURCE_APP_NAME)
my_client.client_login(
USERNAME,
PASSWORD,
SOURCE_APP_NAME,
service='analytics')
query_uri = gdata.analytics.client.DataFeedQuery({
'ids': PROFILE_ID,
'start-date': sd,
'end-date': ed,
#'dimensions': 'ga:date,ga:source', # enable grouping by date
'dimensions': 'ga:source',
'metrics': 'ga:visits,ga:pageviews',
})
feed = my_client.GetDataFeed(query_uri)
# find out if this is the first run through the results
# to build a simple header for the dimensions and metrics
firstRun = True
heading = []
# we'll run through the data feed reutrned from our query
for entry in feed.entry:
# build each row of data from the feed
row = []
# pull all dimensions out of this entry
for dim in entry.dimension:
if firstRun:
heading.append(dim.name)
row.append(dim.value)
# pull all metrics out of this entry
for met in entry.metric:
if firstRun:
heading.append(met.name)
row.append(met.value)
# print the dimension and metric names as the header
if firstRun:
print "t".join(heading) + "r"
# print all rows from the feed as they are built
print "t".join(row) + "r"
# don't print the dimension and metric names as the header again
firstRun = False
if __name__ == '__main__':
main()
Want more dimensions and metrics? Visit the GA API dimensions and metrics reference page http://code.google.com/apis/analytics/docs/gdata/dimsmets/dimsmets.html
These articles were aimed at the technical web marketer who either writes a little code or has a programmer they want to egg on.