March 5, 2010

The Mind of the Researcher — Daniel Russell (akla10)

Daniel Rus­sell, Google Search Qual­ity & User Hap­pi­ness
2010 Alaska Library Asso­ci­a­tion Con­fer­ence, open­ing keynote speaker

Lewis & Clark left with­out a decent map
it’s a com­pli­cated world out there and you don’t want to end up like the Don­ner Party (hey, go that way; it looks good)
what does the cur­rent infor­ma­tion map look like?
let’s be adven­tur­ers but keep our eyes and minds open

did a demo of Google Earth
cost to put the fly­over together = $0 and four min­utes of time
Google will crawl it within 48 hours
when Lewis & Clark pub­lished about their trip, it took 10 years
we see the world dif­fer­ently, and the library isn’t what it used to be
stacks are no longer a core com­pe­tence — the infor­ma­tion land­scape has rad­i­cally changed

1200 exabytes of new content are generated each year (1.2 zettabytes if that helps, or 1.2 billion terabytes)
3.6 zettabytes per person per year (mostly music and video)
libraries don’t have to curate and manage that; it streams to you
text words per person per year = .1% of that total
the good news is that the amount of read­ing per per­son per year has gone up by 3X since 1980 (pri­mar­ily due to inter­net access); hap­pen­ing online, not print
so need to develop new skills and new literacies
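
The scale of the yearly figure is easier to check with a few lines of arithmetic. This is a minimal sketch, assuming decimal (SI) prefixes, which the talk did not specify; the variable names are my own.

```python
# Decimal (SI) byte prefixes. Assumption: the talk did not say SI vs. binary.
TERABYTE = 10**12
EXABYTE = 10**18
ZETTABYTE = 10**21

# Yearly new content, as quoted in the talk.
new_content_bytes = 1200 * EXABYTE

in_zettabytes = new_content_bytes / ZETTABYTE
in_terabytes = new_content_bytes // TERABYTE

print(f"{in_zettabytes} zettabytes")
print(f"{in_terabytes:,} terabytes")
```

Under that assumption, 1200 exabytes is 1.2 zettabytes, or 1.2 billion terabytes.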

showed Google Books
can click on the places in a book and travel to all of them
can actu­ally reca­pit­u­late Huck Finn’s jour­ney down the river

LoC has 10 ter­abytes of text data or .01 petabytes
he has 2 LoCs at home
an exabyte = 50,000 years of DVD or 10 bil­lion copies of The Econ­o­mist (there aren’t enough trees in Alaska to print them all)
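
The comparisons above can be turned around to see what they imply per item. This is a rough sketch, assuming decimal prefixes and a 365-day year; the per-copy size and the bitrate are derived from the quoted figures, not stated in the talk.

```python
TERABYTE = 10**12
EXABYTE = 10**18

# Library of Congress text data, as quoted in the talk.
loc_text_bytes = 10 * TERABYTE
locs_per_exabyte = EXABYTE // loc_text_bytes

# "10 billion copies of The Economist" implies this many bytes per copy.
economist_copies = 10 * 10**9
bytes_per_copy = EXABYTE // economist_copies

# "50,000 years of DVD" implies this average video bitrate.
seconds_per_year = 365 * 24 * 60 * 60
dvd_seconds = 50_000 * seconds_per_year
implied_mbit_per_s = EXABYTE * 8 / dvd_seconds / 10**6

print(f"{locs_per_exabyte:,} LoCs per exabyte")
print(f"{bytes_per_copy / 10**6:.0f} MB per copy of The Economist")
print(f"{implied_mbit_per_s:.2f} Mbit/s of video")
```

So an exabyte is about 100,000 LoCs of text. The comparisons imply roughly 100 MB per magazine copy and about 5 Mbit/s of video, which is a plausible DVD bitrate.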

we’re sup­port­ing this renais­sance of access to print cul­ture at the same time we’re expand­ing online con­tent
1.5 mil­lion out of copy­right books that can be printed for $8 each

do you care about all of this as long as you can get to the stuff that you care about?
what Google is try­ing to fig­ure out is how can I read your mind from the cou­ple of words you gave me — which pages you want to see of theirs out of all of those exabytes of data?
it’s not just text anymore

men­tioned Hans Rosling’s TED talk about visu­al­iz­ing sta­tis­tics
men­tioned Baby Names Voy­ager
Google bought soft­ware to add visual sta­tis­tics to Google Docs
the cool part is I can type my name and see when my name peaked
is this a book? no. is it a visu­al­iza­tion? yes. but it’s also inter­ac­tive. where/how do I cat­a­log this?
these kinds of inter­ac­tive doc­u­ments allow you to under­stand in ways that were not pos­si­ble before
showed what happened to names that begin with vowels during the 40s and 50s: “the valley of the vowels”
the answer to what hap­pened is in the hard con­so­nants
no one knew this until they could see it in this visu­al­iza­tion
our notion of what con­sti­tutes infor­ma­tion and librar­i­an­ship is changing

how do peo­ple search now?
sup­pose you’re Google and you get the query “jaguar” — what do they want?
one of the dif­fer­ences about being Google though is that you’re at a ref­er­ence desk where a bil­lion peo­ple a day ask the question

what about “iraq?” today, it’s the war; 15 years ago, it was probably antiquities
Google sees queries shift­ing a lot
“lat­est release Thinkpad dri­vers touch­pad” = I know exactly what they want
“ebay” = in the top 10 most pop­u­lar queries in Eng­lish per day
“google” is also in the top 10 queries per day — why?? are they try­ing to cause the recur­sive melt­down of Google’s servers?
there are 20,000 ways to misspell “Britney Spears” (and they all want pictures of her)

one of the inter­est­ing things they do is use machine-generated algo­rithms
they don’t have to misspell a new celebrity’s name 20,000 times; their users will do that for them
that’s how information works now
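
One way to picture "their users will do that for them" is to let the most frequent spelling in a query log stand in as the correction for its near-duplicates. This is only an illustrative sketch and not Google's actual method. The query log, the 0.85 threshold, and the function names are all invented here, and Python's `difflib` is a stand-in for whatever similarity measure a real system would use.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical query log, invented for illustration only.
query_log = (
    ["britney spears"] * 50
    + ["brittany spears"] * 6
    + ["britany spears"] * 4
    + ["britney speers"] * 2
    + ["jaguar"] * 10
)


def similarity(a, b):
    """Return a 0..1 similarity score between two strings."""
    return SequenceMatcher(None, a, b).ratio()


def learn_corrections(log, threshold=0.85):
    """Map rarer spellings to the most frequent similar query."""
    counts = Counter(log)
    canonical = []    # frequent spellings accepted as-is
    corrections = {}  # rarer spelling -> frequent spelling
    # Visit queries from most to least frequent.
    for query, _count in counts.most_common():
        match = next(
            (c for c in canonical if similarity(query, c) >= threshold),
            None,
        )
        if match is None:
            canonical.append(query)
        else:
            corrections[query] = match
    return corrections


corrections = learn_corrections(query_log)
for wrong, right in corrections.items():
    print(f"{wrong!r} -> {right!r}")
```

With this made-up log, the three rarer spellings all map to the most common one, while "jaguar" stays a query of its own.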

he goes to people’s homes a lot to talk to them and watch their behavior
showed a video clip of some­one search­ing at home for which celebrity has won the most Oscars
(she was pretty con­fused with the results she was get­ting — didn’t real­ize she had moved into the “Google News” sec­tion)
she has a grad­u­ate degree, runs her own web­site, and has her own tv show
the equiv­a­lent of watch­ing some­one look­ing at a text­book in the library and won­der­ing why she’s sud­denly look­ing at the news
this is why he has a job ;-)
he sees prob­lems in the world and tries to fix them

weekly sta­tis­tics:
3.9 vis­its per user
9.4 searches per user
11.2 search clicks per user
4 min­utes dura­tion
29% query refine­ment rate
they’re not spend­ing a lot of time in “the stacks”
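
A few derived ratios make the weekly numbers easier to read. This is a back-of-the-envelope sketch, assuming every figure is a per-user weekly average over the same population (the talk did not say); the variable names are my own.

```python
# Weekly per-user averages, as quoted in the talk.
visits_per_week = 3.9
searches_per_week = 9.4
clicks_per_week = 11.2

# Derived ratios. These assume the averages describe the same set of users.
searches_per_visit = searches_per_week / visits_per_week
clicks_per_search = clicks_per_week / searches_per_week
searches_per_day = searches_per_week / 7

print(f"{searches_per_visit:.2f} searches per visit")
print(f"{clicks_per_search:.2f} clicks per search")
print(f"{searches_per_day:.2f} searches per day")
```

Under that assumption, a typical visit is two or three searches with a little over one click each.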

66% of their users have less than one query per day
aver­age query length is less than 3 words
the “very con­fi­dent” peo­ple in a Pew study search mul­ti­ple times per day (34%)
suc­cess makes them search more often
92% feel con­fi­dent in their search­ing abil­ity
you don’t get good doing any­thing less than once per day (for four min­utes, no less)
55% call them­selves an “expert searcher” (despite how lit­tle I use the system)

they’re happy when they get a result from a search
peo­ple think of exper­tise as being socially-normed
“all of my friends say I’m the best searcher” — you want to say you’re good
peo­ple like to take on tasks they can suc­ceed at
showed an example where the difference in the question was “ghost town” vs “abandoned city”
the “ghost town” peo­ple didn’t do well search­ing and were unhappy — took them a lot longer to find the infor­ma­tion
librar­i­ans are syn­onym pro­fes­sion­als
“functional fixedness”: being stuck on a search term, not being able to think of a synonym

Google is try­ing to con­vert peo­ple from the “ghost town” group to the “aban­doned city” group
they can see improve­ment over time

but the infor­ma­tion land­scape is so complex

Google launches about 10 prod­ucts per week, although more are invis­i­ble (tweaks to the algo­rithm, etc.)
but so far this year (and it’s only March 5), they’ve launched:
a really long list of things
these are all things that hap­pened to our infor­ma­tion land­scape in the last two months
new kinds of con­tent are com­ing online all the time
3D mod­els in SketchUp
“what’s a fly­ing but­tress? let me show” vs a 2D pic­ture in a Time-Life book

new kinds of query­ing infor­ma­tion
eg, Google Goggles: “Google, what’s that?”
“your cellphone: it’s not just for typing anymore”
“wait, when did cellphones become standard for typing?”
tak­ing a pic­ture of a book gives you the meta­data about it (same for a bot­tle of wine, etc.)
you don’t have to type as much any­more
the way you inter­act with Google is changing

with Google Earth, if you fly to the Prado in Madrid, you can fly into the build­ing and even into one of the pic­tures; they’ll throw you out of the build­ing if you try that in Spain
get a level of detail you can’t see if you go there

Google Flu Trends
can tell when flu out­breaks are hap­pen­ing around the world by watch­ing for where queries are being made from
showed chart that illus­trates Alaska got it worse than other places and the out­break peaked in Octo­ber
any­one can run queries in Google Trends
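
The idea behind this can be sketched as a per-region share of flu-related queries, tracked over time. This is a toy illustration, not how Google Flu Trends is actually computed. Every number below is invented, and the region and month labels are only there to mirror the chart described above.

```python
from collections import defaultdict

# Hypothetical rows: (region, month, flu-related queries, all queries).
samples = [
    ("Alaska", "Sep", 120, 10_000),
    ("Alaska", "Oct", 310, 10_000),
    ("Alaska", "Nov", 150, 10_000),
    ("Oregon", "Sep", 400, 50_000),
    ("Oregon", "Oct", 700, 50_000),
    ("Oregon", "Nov", 500, 50_000),
]


def flu_query_share(rows):
    """Return {region: {month: share of queries that are flu-related}}."""
    share = defaultdict(dict)
    for region, month, flu_queries, all_queries in rows:
        share[region][month] = flu_queries / all_queries
    return share


def peak_month(share_by_month):
    """Return the month with the highest flu-related query share."""
    return max(share_by_month, key=share_by_month.get)


shares = flu_query_share(samples)
for region, by_month in shares.items():
    month = peak_month(by_month)
    print(f"{region}: peak in {month} at {by_month[month]:.1%}")
```

Dividing by total queries matters: a region with more users sends more flu queries even when no outbreak is under way.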

how do you find Google Trans­la­tion Ser­vices? it’s not a book on a shelf
“when in doubt, search it out”
they’re work­ing rad­i­cally fast to change our world

Quantum ESP experiment
showed the old “psy­chic rab­bit” trick with play­ing cards
the point is that every­thing changes
you can’t pay atten­tion to every­thing
you’re smart — why didn’t you remem­ber all of the cards? because he told you to focus on one
there’s lots of stuff going on with your per­cep­tion and what you’re pay­ing atten­tion to

what have you noticed? what have you not noticed?
no one notices things like the lit­tle arrow that expands the map or lets you pan around the map and the “more” link
nobody sees these things — he has the logs to prove it
they’re focus­ing on what they’re try­ing to do
“perceptual or change blindness”
showed the dif­fer­ence between a Google Map from 5 years ago ver­sus today
nobody noticed the results moved from the right side to the left
they change things all the time and nobody notices

how do we learn? how do we help our patrons learn?
it’s not like they’re ship­ping a new ver­sion of an OS — they’re chang­ing every­thing all the time, every day
and it’s not all nicely curated or indexed
that’s the growth rate we have to be think­ing about

how do we help our patrons?
of the 4 Rs, the fourth one is really “research“
in order to write com­pre­hen­sively and deeply, you need to do deep research
it’s not just look­ing up a call num­ber — that’s just the begin­ning
this is no longer optional — now the whole cul­ture has to under­stand this, not just librarians

analy­sis from 40 inter­views:
every­body knows what a query is, what a result is
but no one knows what “search on page” and “search in results” mean
it’s not helped by click­bombs like the “mis­er­able fail­ure” search results
if you’re not on the inside with a mech­a­nism to under­stand how this stuff works, you think Google is mon­key­ing with the sys­tem, even though they aren’t; some­one else is
most people don’t understand “classic search engine optimization”
makes it impos­si­ble to have a coher­ent men­tal model for how the web works

with­out a detailed model, we’re “cargo cultists” (New Guinea)
when some­one tells you to reboot the router to get wire­less back, you’re a cargo cultist
“never click up there”

“I dunno how it works. I just type words, and answers come back to me… about anything… anything at all…” (a student)
within his realm, he was a good searcher
devel­oped vocab­u­lary and domain knowl­edge around expen­sive watches but can’t find the cap­i­tal of Alaska

when you’re in West­Law, you have to know how to make the oper­a­tors work
in Google, you have to know how to come up with good search terms

6 kinds of knowl­edge & skills needed to search:
– pure engine tech­nique (choos­ing good terms, dou­ble quotes, etc.)
– infor­ma­tion map­ping (reverse dic­tio­nary, con­tents of domains, Wikipedia, etc.)
– domain knowl­edge (med­ical knowl­edge, plumb­ing knowl­edge, etc.)
– search strat­egy (know­ing when to shift strate­gies, move from wide to nar­row, pre­serv­ing state, etc.)
– assess­ment (how do you assess the cred­i­bil­ity of a resource? a lot of this is tied up in domain knowl­edge, which 16-year olds don’t have)
– site-specific knowl­edge (know­ing how a site works, is laid out, etc.)
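
As a small illustration of "pure engine technique", the sketch below builds exact-phrase queries (double quotes) for the two synonyms from the earlier example. The helper names are my own. The only thing assumed about Google is the familiar `https://www.google.com/search?q=` URL form.

```python
from urllib.parse import urlencode


def exact_phrase(phrase):
    """Wrap a phrase in double quotes so it is searched as a unit."""
    return f'"{phrase}"'


def search_url(*terms, base="https://www.google.com/search"):
    """Join the terms into one query and URL-encode it."""
    query = " ".join(terms)
    params = urlencode({"q": query})
    return f"{base}?{params}"


# Trying a synonym is often the cheapest change of strategy.
synonyms = ["ghost town", "abandoned city"]
urls = [search_url(exact_phrase(s)) for s in synonyms]

for url in urls:
    print(url)
```

Typing the quoted phrase into the search box does the same thing; the code just shows what "double quotes" adds to the query.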

basic skills:
– Control-F to find
– tabs (how to use effec­tively to orga­nize search)
– keyword query choice (effective choices; low/high frequency terms)
– tac­tics (when to focus on par­tic­u­lar resource)
– strate­gies (how long to pur­sue a tac­tic; when to switch; how to dis­cover)
– under­stand­ing what you find (read­ing for under­stand­ing SERPs; not “overreading”)

teach­ing research skills
– want peo­ple to under­stand the world and do research so they under­stand the world
– not just web search skills
– author­ity assess­ment
– crap detec­tion
– stay­ing on task
– dis­cov­ery
– note­tak­ing
– data inte­gra­tion
– rep­re­sen­ta­tion construction

find­ings:
1 — very uneven indi­vid­ual level of search skill (every­one showed at least one “deep” skill; every­one showed at least one mis­taken under­stand­ing; 90% wished they knew how to search bet­ter, but only 10% will take a class)
search behav­ior pat­terns
users don’t know the names of parts or rec­og­nize them (includ­ing URL, site, query; it’s hard to search for things you can’t name; don’t want to click on that because it might bring up porn)

2 — com­fort level is VERY impor­tant
users choose famil­iar over scary
people tend not to explore things they don’t know
they worry about finding porn
they worry about having unknown things happen when they click on strange links
– edu­ca­tion is acci­den­tal
– peo­ple are not good reporters of their own behav­ior (“I don’t have a tool­bar; I don’t do image search”)

3 — peo­ple don’t know much about Google as a whole (an oppor­tu­nity for librar­i­ans)
they don’t know what’s pos­si­ble
a CTO who didn’t know how to find Google Maps to find a pub in Palo Alto
a PhD cog­ni­tive psy­chol­o­gist didn’t know about Google Scholar
– tar­get site knowl­edge is critical

where do we go next?
– there is a big, big, big need for help — it’s not all intu­itive; they can’t yet do mind-reading
– huge range of men­tal mod­els among users
– users, for the most part, have lit­tle idea what’s pos­si­ble in web search or how to use it effec­tively
they’re learn­ing acci­den­tally from peers or from librar­i­ans
we’re look­ing at an information-illiterate pop­u­la­tion
no one else is show­ing them

– show them the shape of the information landscape
– teach your patrons
– make time to con­tin­u­ally edu­cate your­self (you’re now enrolled in a per­ma­nent edu­ca­tion process; if you miss it for a cou­ple of years, good luck catch­ing up)

every­thing is shift­ing and mov­ing faster, so make time for con­tin­ual self-improvement
“be the Lewis, be the Clark” — com­mu­ni­cate this stuff to our patrons
be the core of dis­cov­ery for patrons

