Daniel Russell, Google Search Quality & User Happiness
2010 Alaska Library Association Conference, opening keynote speaker
Lewis & Clark left without a decent map
it’s a complicated world out there and you don’t want to end up like the Donner Party (hey, go that way; it looks good)
what does the current information map look like?
let’s be adventurers but keep our eyes and minds open
did a demo of Google Earth
cost to put the flyover together = $0 and four minutes of time
Google will crawl it within 48 hours
when Lewis & Clark published about their trip, it took 10 years
we see the world differently, and the library isn’t what it used to be
stacks are no longer a core competence – the information landscape has radically changed
1200 exabytes of new content are generated each year (1.2 zettabytes if that helps, or 1.2 billion terabytes)
3.6 zettabytes per person per year (mostly music and video)
libraries don’t have to curate and manage that – it streams to you
text words per person per year = 0.1% of that total
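The unit conversions behind the 1200-exabyte figure can be checked with a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, where each step up is a factor of 1000; the `UNITS` table and `convert` helper are names made up for this illustration and were not part of the talk.

```python
# Decimal (SI) byte units as powers of ten (assumption: not binary units).
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21, "YB": 10**24}

def convert(value, from_unit, to_unit):
    """Convert a quantity from one SI byte unit to another."""
    return value * UNITS[from_unit] / UNITS[to_unit]

yearly_eb = 1200  # exabytes of new content per year, as quoted in the talk

print(convert(yearly_eb, "EB", "ZB"))  # the same amount in zettabytes
print(convert(yearly_eb, "EB", "TB"))  # the same amount in terabytes
```

Under that assumption, 1200 exabytes works out to 1.2 zettabytes, or 1.2 billion terabytes.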
the good news is that the amount of reading per person per year has gone up by 3X since 1980 (primarily due to internet access); happening online, not print
so need to develop new skills and new literacies
showed Google Books
can click on the places in a book and travel to all of them
can actually recapitulate Huck Finn’s journey down the river
LoC has 10 terabytes of text data or .01 petabytes
he has 2 LoCs at home
an exabyte = 50,000 years of DVD or 10 billion copies of The Economist (there aren’t enough trees in Alaska to print them all)
we’re supporting this renaissance of access to print culture at the same time we’re expanding online content
1.5 million out-of-copyright books that can be printed for $8 each
do you care about all of this as long as you can get to the stuff that you care about?
what Google is trying to figure out is how can I read your mind from the couple of words you gave me – which pages you want to see of theirs out of all of those exabytes of data?
it’s not just text anymore
mentioned Hans Rosling’s TED talk about visualizing statistics
mentioned Baby Names Voyager
Google bought software to add visual statistics to Google Docs
the cool part is I can type my name and see when my name peaked
is this a book? no. is it a visualization? yes. but it’s also interactive. where/how do I catalog this?
these kinds of interactive documents allow you to understand in ways that were not possible before
showed what happened to names that begin with vowels during the 40s and 50s – “the valley of the vowels”
the answer to what happened is in the hard consonants
no one knew this until they could see it in this visualization
our notion of what constitutes information and librarianship is changing
how do people search now?
suppose you’re Google and you get the query “jaguar” – what do they want?
one of the differences about being Google though is that you’re at a reference desk where a billion people a day ask the question
what about “iraq”? today, it’s the war; 15 years ago, it was probably antiquities
Google sees queries shifting a lot
“latest release Thinkpad drivers touchpad” = I know exactly what they want
“ebay” = in the top 10 most popular queries in English per day
“google” is also in the top 10 queries per day – why?? are they trying to cause the recursive meltdown of Google’s servers?
there are 20,000 ways to mis-spell “Britney Spears” (and they all want pictures of her)
one of the interesting things they do is use machine-generated algorithms
they don’t have to mis-spell a new celebrity’s name 20,000 times – their users will do that for them
that’s how information works now
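The idea of letting users’ own mis-spellings do the work can be sketched roughly as follows. This is only an illustration, not Google’s actual method: it assumes a query log reduced to counts, uses Python’s standard `difflib.SequenceMatcher` as a stand-in similarity measure, and the `suggest_spelling` function, the `min_ratio` threshold, and the log counts are all invented for the example.

```python
from collections import Counter
from difflib import SequenceMatcher

def suggest_spelling(query, log_counts, min_ratio=0.8):
    """Return the most frequently logged query that looks similar to `query`.

    Assumption: the most common spelling among near-duplicates is the
    intended one. Falls back to `query` itself if nothing beats it.
    """
    best, best_count = query, log_counts.get(query, 0)
    for candidate, count in log_counts.items():
        similarity = SequenceMatcher(None, query, candidate).ratio()
        if similarity >= min_ratio and count > best_count:
            best, best_count = candidate, count
    return best

# Made-up log counts, for illustration only.
log = Counter({
    "britney spears": 5000,
    "brittany spears": 400,
    "britany spears": 300,
    "brittney spears": 250,
    "jaguar": 900,
})

print(suggest_spelling("britany spears", log))
print(suggest_spelling("jaguar", log))
```

With these invented counts, the mis-spelled query maps to the most common variant, while an unrelated query such as “jaguar” is left alone.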
he goes to people’s homes a lot to talk to them and watch their behavior
showed a video clip of someone searching at home for which celebrity has won the most Oscars
(she was pretty confused with the results she was getting – didn’t realize she had moved into the “Google News” section)
she has a graduate degree, runs her own website, and has her own tv show
the equivalent of watching someone looking at a textbook in the library and wondering why she’s suddenly looking at the news
this is why he has a job 😉
he sees problems in the world and tries to fix them
weekly statistics:
3.9 visits per user
9.4 searches per user
11.2 search clicks per user
4 minutes duration
29% query refinement rate
they’re not spending a lot of time in “the stacks”
66% of their users have less than one query per day
average query length is less than 3 words
the “very confident” people in a Pew study search multiple times per day (34%)
success makes them search more often
92% feel confident in their searching ability
you don’t get good doing anything less than once per day (for four minutes, no less)
55% call themselves an “expert searcher” (despite how little I use the system)
they’re happy when they get a result from a search
people think of expertise as being socially-normed
“all of my friends say I’m the best searcher” – you want to say you’re good
people like to take on tasks they can succeed at
showed an example where the difference in the question was “ghost town” vs “abandoned city”
the “ghost town” people didn’t do well searching and were unhappy – took them a lot longer to find the information
librarians are synonym professionals
“functional fixedness” – being stuck on a search term, not being able to think of a synonym
Google is trying to convert people from the “ghost town” group to the “abandoned city” group
they can see improvement over time
but the information landscape is so complex
Google launches about 10 products per week, although more are invisible (tweaks to the algorithm, etc.)
but so far this year (and it’s only March 5), they’ve launched:
a really long list of things
these are all things that happened to our information landscape in the last two months
new kinds of content are coming online all the time
3D models in SketchUp
“what’s a flying buttress? let me show” vs a 2D picture in a Time-Life book
new kinds of querying information
eg, Google Goggles – “Google, what’s that?”
“your cellphone – it’s not just for typing anymore”
“wait – when did cellphones become standard for typing?”
taking a picture of a book gives you the metadata about it (same for a bottle of wine, etc.)
you don’t have to type as much anymore
the way you interact with Google is changing
with Google Earth, if you fly to the Prado in Madrid, you can fly into the building and even into one of the pictures; they’ll throw you out of the building if you try that in Spain
get a level of detail you can’t see if you go there
Google Flu Trends
can tell when flu outbreaks are happening around the world by watching for where queries are being made from
showed chart that illustrates Alaska got it worse than other places and the outbreak peaked in October
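A rough sketch of how watching where queries come from could surface an outbreak: compare each region’s share of flu-related queries against its usual share and flag the ones that jump. This is an assumption-heavy illustration, not how Google Flu Trends is actually implemented; the `flag_outbreaks` function, the `factor` threshold, and all of the numbers are invented.

```python
def flag_outbreaks(weekly_counts, baseline_share, factor=2.0):
    """Return regions whose flu-query share exceeds `factor` times baseline.

    weekly_counts:  {region: (flu_queries, total_queries)} for one week
    baseline_share: {region: usual fraction of queries that are flu-related}
    Regions with no traffic or no known baseline are skipped.
    """
    flagged = []
    for region, (flu_queries, total_queries) in weekly_counts.items():
        if total_queries == 0 or region not in baseline_share:
            continue
        share = flu_queries / total_queries
        if share > factor * baseline_share[region]:
            flagged.append(region)
    return sorted(flagged)

# Invented numbers, for illustration only.
this_week = {"Alaska": (90, 10_000), "Oregon": (25, 10_000)}
usual = {"Alaska": 0.002, "Oregon": 0.002}

print(flag_outbreaks(this_week, usual))
```

In this made-up week only Alaska is flagged, since its flu-query share is well over twice its usual level.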
anyone can run queries in Google Trends
how do you find Google Translation Services? it’s not a book on a shelf
“when in doubt, search it out”
they’re working radically fast to change our world
Quantum ESP experiment
showed the old “psychic rabbit” trick with playing cards
the point is that everything changes
you can’t pay attention to everything
you’re smart – why didn’t you remember all of the cards? because he told you to focus on one
there’s lots of stuff going on with your perception and what you’re paying attention to
what have you noticed? what have you not noticed?
no one notices things like the little arrow that expands the map or lets you pan around the map and the “more” link
nobody sees these things – he has the logs to prove it
they’re focusing on what they’re trying to do
“perceptual or change blindness”
showed the difference between a Google Map from 5 years ago versus today
nobody noticed the results moved from the right side to the left
they change things all the time and nobody notices
how do we learn? how do we help our patrons learn?
it’s not like they’re shipping a new version of an OS – they’re changing everything all the time, every day
and it’s not all nicely curated or indexed
that’s the growth rate we have to be thinking about
“how do we help our patrons”
of the 4 Rs, the fourth one is really “research”
in order to write comprehensively and deeply, you need to do deep research
it’s not just looking up a call number – that’s just the beginning
this is no longer optional – now the whole culture has to understand this, not just librarians
analysis from 40 interviews:
everybody knows what a query is, what a result is
but no one knows what “search on page” and “search in results” mean
it’s not helped by clickbombs like the “miserable failure” search results
if you’re not on the inside with a mechanism to understand how this stuff works, you think Google is monkeying with the system, even though they aren’t; someone else is
most people don’t understand “classic search engine optimization”
makes it impossible to have a coherent mental model for how the web works
without a detailed model, we’re “cargo cultists” (New Guinea)
when someone tells you to reboot the router to get wireless back, you’re a cargo cultist
“never click up there”
“I dunno how it works. I just type words, and answers come back to me… about anything… anything at all…” – student
within his realm, he was a good searcher
developed vocabulary and domain knowledge around expensive watches but can’t find the capital of Alaska
when you’re in Westlaw, you have to know how to make the operators work
in Google, you have to know how to come up with good search terms
6 kinds of knowledge & skills needed to search:
– pure engine technique (choosing good terms, double quotes, etc.)
– information mapping (reverse dictionary, contents of domains, Wikipedia, etc.)
– domain knowledge (medical knowledge, plumbing knowledge, etc.)
– search strategy (knowing when to shift strategies, move from wide to narrow, preserving state, etc.)
– assessment (how do you assess the credibility of a resource? a lot of this is tied up in domain knowledge, which 16-year-olds don’t have)
– site-specific knowledge (knowing how a site works, is laid out, etc.)
basic skills:
– Control-F to find
– tabs (how to use effectively to organize search)
– keyword query choice (effective choices; low/high frequency terms)
– tactics (when to focus on particular resource)
– strategies (how long to pursue a tactic; when to switch; how to discover)
– understanding what you find (reading for understanding SERPs; not “overreading”)
teaching research skills
– want people to understand the world and do research so they understand the world
– not just web search skills
– authority assessment
– crap detection
– staying on task
– discovery
– notetaking
– data integration
– representation construction
findings:
1 – very uneven individual level of search skill (everyone showed at least one “deep” skill; everyone showed at least one mistaken understanding; 90% wished they knew how to search better, but only 10% will take a class)
search behavior patterns
users don’t know the names of parts or recognize them (including URL, site, query; it’s hard to search for things you can’t name; don’t want to click on that because it might bring up porn)
2 – comfort level is VERY important
users choose familiar over scary
people tend not to explore things they don’t know
they worry about finding porn
they worry about having unknown things happen when they click on strange links
– education is accidental
– people are not good reporters of their own behavior (“I don’t have a toolbar; I don’t do image search”)
3 – people don’t know much about Google as a whole (an opportunity for librarians)
they don’t know what’s possible
a CTO who didn’t know how to find Google Maps to find a pub in Palo Alto
a PhD cognitive psychologist didn’t know about Google Scholar
– target site knowledge is critical
where do we go next?
– there is a big, big, big need for help – it’s not all intuitive; they can’t yet do mind-reading
– huge range of mental models among users
– users, for the most part, have little idea what’s possible in web search or how to use it effectively
they’re learning accidentally from peers or from librarians
we’re looking at an information-illiterate population
no one else is showing them
– show them the shape of the information landscape
– teach your patrons
– make time to continually educate yourself (you’re now enrolled in a permanent education process; if you miss it for a couple of years, good luck catching up)
everything is shifting and moving faster, so make time for continual self-improvement
“be the Lewis, be the Clark” – communicate this stuff to our patrons
be the core of discovery for patrons
March 5, 2010
The Mind of the Researcher – Daniel Russell (akla10)