Monday, April 27, 2015

Apple #709: Behind the Daily Apple -- Doing the Research

This is part two in a series of entries that attempt to answer the question, how does the Apple Lady do what she does?  (For all you dirty-minded folks out there, that means how do I assemble these here Daily Apples.)

The first part talked about how to construct a search query that yields targeted results.  The next step in what I do is to read what I find, to follow what information I find -- which can sometimes surprise me -- and then organize it in a coherent way and type it up.

Mahalia (the faithful Apple reader who requested this), I've been thinking about this for a week, and I'm still a bit stumped as to how to represent this process for you in a way that will be neither so detailed as to be tedious nor so glossed-over as to be useless.  I might like to have two columns of text, one where I type up what would be the Daily Apple entry, and another beside it where I comment on how I arrived at thus & so.  But Blogger doesn't allow for any two-column shenanigans.  So I'll have to come up with another method.


You might say we're getting to the heart of the Daily Apple in this entry. (The apple, by the way, is a Pink Lady.)
(Photo from Chauncey's in the UK)

The Question

First things first. Every entry starts with the question at hand.  If I don't have a question from a faithful or interested reader, I come up with my own.  Usually it's some oddment or other that I ran into in conversation during the week.  Occasionally it's something in the news that I don't understand entirely.  And sometimes there is no unanswered question nagging me -- or at least, not one that comes to mind when I sit down to assemble one of these here Daily Apples -- so I try to think of something.  These are usually my criteria:
  • A question whose answer I don't already know
  • The topic is something everyday, nothing arcane like how nanospheres work, but rather something that most of us encounter or could encounter in our daily lives
  • I try to keep the question focused. Nothing so general as "tell me everything about tigers." Much as I like tigers and do want to know as much about them as possible, I've learned that if my topic is too general, I find myself reproducing encyclopedia entries. I'd rather choose a topic that is a little more focused, like "what noises do tigers make" because I think I'd have a better chance of telling you something you don't already know, and also because this is a way to try to keep my entries not so long that no one will read them.  I have a tendency to go long.

For this meta-Apple, I am going to violate one of my criteria and choose a question whose answer I already know: what's up with the biting ladybugs?  To get a bit more specific, where did they come from?  Did they always bite, or is this a new thing?

I'm choosing a known topic to give myself a better shot at showing you how I do what I do.  No surprises in the searching, no need to completely re-work the organization of the entry, or other apple-tastrophes like that.

Once I've stated the question, I usually put a picture just beneath it.  I try to find a picture that typifies the entry as a whole, in case that picture becomes the thumbnail associated with the entry.  I want the image to be something colorful or engaging, something that makes you want to click on or read the entry.

In this case, I could show you a picture of swarming or biting ladybugs, but I do not want to give you the itchies right off the bat.  Instead, I'll just show you a lot of ladybugs.  Or maybe just one.



Asian ladybug. This kind bites.
(Photo from GardenWeb)


The process of how I find the images and how I link to them -- all that I'll cover in another entry about images.

The Search

The next thing that happens is I go do a lot of Googling.  How I do the Googling and how I build a search query, I discussed in the previous entry, so I won't go through all that again.

What I'm going to show you instead is what I look for when I've done the search query, and how I proceed through the results I get.



Today's results of a Google search for "biting ladybugs"


So I started out with a pretty unprofessional search query, just typing in biting ladybugs.  No quotation marks or anything.  This was my first search on the topic, and I just wanted to throw something out there and see what came back, see if I needed to refine my search at all, and in what way.

I also wanted to get a sense of the general public's perception about ladybugs.  Do lots of people know that ladybugs bite?  Do some people think that they don't?

Sometimes the mis-information about a topic can tell you something pretty interesting about the topic itself, so much so that the mis-information itself can become the interesting thing to talk about.  Like, for example, if I decided to do a Daily Apple on the roundness of the Earth, I would absolutely have to talk about the fact that some people persist in believing that the Earth is flat. [insert gif of someone shuddering here]

But in this case, it looks like lots of people do know that ladybugs bite, though they seem to be rather stunned by this, or unable to believe that something as apparently innocuous as a ladybug would ever bite someone.  I'm gathering this from the questions that have been posed:  "Do Ladybugs (the garden variety ladybug we all know and love) bite? My mother insists that she received a nasty bite from a ladybug" and "I looked up whether lady bugs bite after being bitten by one today."

So I might want to begin the answer part of my entry with something like this:

  • Yes, it's true.  The ladybugs that you so loved when you were small, the ones whose cutout shapes decorated the walls of your pre-school and kindergarten classrooms, do bite.  
  • Or at least, one species that lives here now does.
(I'm getting ahead of myself here.  I'm revealing part of the answer that I already know.)

The Sources


Let's look at the results of my search again, more closely.  There are several entries from "homemade" sites like mine -- regular Joe or Jane Schmos typing up their experiences with ladybugs.  Their experiences might be very interesting, or their research could be very reliable.  Heck, your Daily Apple is a "homemade" site.  But I wouldn't choose these as the FIRST site I check on a topic.  If there were no other good results, I might go to a homemade site first, but then I would look for confirming information from some other, more traditionally reliable sites.

Another place I would put lower on my list is the hit from Quora.com, which is a site where people post questions they want answered, and then they vote on what they think is the best answer--sort of like Answers.com, or Ask Yahoo, or those other public sites where anyone can post an answer, and anyone can decide it's a good one.

And while we're on the subject of less-favorable sources, let's talk about Wikipedia.  Wikipedia can be a great place TO START.  Oftentimes there's stuff in a Wikipedia entry that I didn't know, and that's either because I'm not fully informed, or else it's because somebody made up some crap or didn't cite their sources and I can't verify it.



Wikipedia.  Nice place to visit, but you don't want to live there.
(Image from Wikipedia Commons)


So if I use Wikipedia at all, I consider it a jumping-off point.  If I do look at it, I try to find at least two other sources -- yes, two -- that back up what's on Wikipedia.  If I find some tidbit of information on Wikipedia first, I try to follow it to its source and then I look for at least one other page or site that discusses that tidbit.  Usually when I do that, I discover qualifications, some additional detail that reveals something was glossed over in the Wikipedia entry, or explained badly or incompletely.  This is another reason I double- and triple-check information I find on Wikipedia.

Only when I feel like I've got a solid answer that I can explain clearly to myself as well as to you, and only when I've used sources I feel confident about recommending, do I post something.

Your Apple Lady does not want to be telling you a bunch of lies & made-up junk, that's for sure.

In this case, the keyword snapshot that pops up with the Wikipedia link sounds pretty interesting:  "Sometimes, the beetles will bite humans, presumably in an attempt to acquire salt. . . "  Well, that would be interesting to know, wouldn't it?  WHY ladybugs bite?  That's sort of on the order of what do women want, in the insect world, isn't it?

But this is a little bit farther down the line than we are right now.  We still need to talk more about the fact that ladybugs do bite.  This is another thing I want to point out.  I don't like to give you only yes/no answers.  I like to give you context.  Detail.  Background.  The bigger picture.  So you don't just know that ladybugs bite, you know which ones, in what circumstances, maybe in what parts of the country or what parts of the world, and so on.  You're more likely to remember the yes/no answer if you've got more parts of that bigger picture in your mind.  And, frankly, I like to know the details.  Hate to break it to you, but I'm looking this stuff up as much for me as for you.



Screenshot of Creature Control's page on the Asian Lady Beetle


OK, the first link I clicked on was the one to Creature Control's site.  It was second on the Google results list, after the one that sounded homemade.  Usually Google puts at the top pages that are reliable or authoritative or that are really strongly focused on the topic at hand.  So I trusted Google to give me better results at the top.

This is information from a commercial site.  Meaning, Creature Control is a business -- in this case, a pest exterminating business, like Orkin -- and they are posting information about one of the pests they exterminate.  These are folks who deal with bugs on a daily basis, so they ought to know about these bugs.

However, sometimes these sorts of things are typed up by people who are not so good at the proofreading, or people whose knowledge is so pigeonholed as to be incomplete, or who round out their information with rumor or wild guesses or other kinds of slipshod information.  So I've learned to regard pages like these as a pretty good place to start, but another type of source to be verified.

Here is Creature Control's introductory paragraph:

The Asian lady beetle (not to be confused with the indigenous American ladybug) is an invasive species of the Coccinellidae family introduced into the United States in 1988 for the purpose of reducing native aphid populations. Since 1988, they have spread throughout North America, in most places displacing the native ladybug populations to become the dominant Coccinellidae beetle. Because of their destruction of plant life and their aggressive tendency to bite, Asianlady beetles are commonly considered a nuisance pest. [Creature Control, Asian Lady Beetle]

You can see what I mean about the proofreading -- sometimes "Asian lady" is one word, sometimes two, Coccinellidae is sometimes italicized, sometimes not.  The reason I point this out is not to be annoying (well, typos do annoy me), but because this can be a sign that the information itself is similarly treated with half-attentive care.

In this case, after having looked at several other sites on the topic, I can tell you that Creature Control has done a really good job with the facts here.  Other sites might say the same sort of thing more succinctly or with better spelling, or with a little more detail, but they all back up what Creature Control says here.

So I would keep this page open as one of my tabs to refer to, and then I would go back to my search results and look at the next one that catches my eye.

Let's look at a page that I would consider more authoritative.


 
Screenshot of Michigan State University's Diagnostic Services page on Multicolored Asian Ladybeetles


A lot of state universities in the US have what are usually called Extension services.  These are departments or branches within the university, usually associated with agriculture or farming, which provide information and assistance to the public.  My mom used to call the MSU extension in our city (she was an MSU grad) whenever she found a bug she couldn't identify in the house or in the yard.  They were glad to know about these bugs because it helped them in their research to know which bugs were appearing where, and they would tell my mom what kind of bug it was, was it a good or a bad thing that it was in the house or the yard, and what should she do if she wanted to get rid of it.

That was before the days of the internet.  Now, instead of calling their offices, you can search their website.

So I happen to know that these Extension Service people, and by extension (hah!) their websites, can be very helpful in explaining their bug & plant research to the public.  This Diagnostic Services page looks like it might be a service like that.  So I am likely to grant them a lot of credibility.

They are referring to these ladybugs by their species name (Harmonia axyridis), and they are also giving several common names for it: ladybeetle or ladybug in general, and Multicolored Asian Ladybeetle in particular.

The problem is, there is a TON of information here.  They're talking about what the larvae look like, and how the eggs hatch, and what the pupae do.  In most cases, I omit this level of detail.  Only if there were some bizarre fact that I thought would be fun to point out (let's pretend, for example, they said that the larvae of these ladybugs drive cars by the time they're two weeks old!) might I include something from that panoply of detail.  But if it isn't related to the topic at hand, or if it doesn't shed more light on the whys and wherefores of the topic at hand, then I'll skip passing it on to you.

But let me show you one paragraph of theirs.  Here you can see they provide a lot more detail about the introduction of this ladybug into the United States than the Creature Control page did:

The multicolored Asian lady beetle is a native of Asia. There were several attempts to introduce the beetle into the southeastern and southwestern portions of the United States to help control aphids on pecan trees back in the late 70’s. Some say that none of these deliberate attempts succeeded, but that the beetle became established after ‘jumping ship’ somewhere along the Gulf coast. Since then it has spread rapidly throughout the US and southern Canada. It was first found in Ontario in 1992. Despite popular rumors, the beetle was not released by the DNR, MSU, or chemical companies. One reason that might explain their large numbers is our newest aphid pest, the soybean aphid. This aphid was discovered in Michigan and other Midwestern states during the summer of 2000. Thousands of these aphids can occur on a single soybean plant and the Asian Multicolored Ladybeetle is taking advantage of this unlimited food source. Soybean aphid populations were very high in 2001 (the last time we saw large populations of the Asian lady beetle), and again this past summer. Many of the soybean plants examined at our diagnostic lab have had a dozen or so ladybeetle larvae munching away on the hapless aphids. There are multiple generations of the beetle during the summer and the adults can live up to three years. [Michigan State University, Multicolored Asian Ladybeetle]

They say the ladybug was introduced in the late 1970s.  Creature Control said 1988.  So I'm going to have to find other sources that confirm either one of those dates, or perhaps clarify why these two sources differ.

But they also say this ladybug was introduced to control aphids, and they also mention pecan trees (Creature Control mentioned the pecan trees in another paragraph on their page), so I'm feeling pretty good about those two basic facts.

This paragraph really loses its focus, though. It starts out talking about the introduction of the ladybugs, moves into the aphids that the ladybugs ate, then the soybean aphids in particular (when was "this past summer," exactly?), and finishes with how many generations of the beetle live in one summer.  These facts might all be solid and verifiable, but it will take some organization to straighten out all this information.  Whoever wrote this probably knows a ton of stuff, and gets excited about what they know, and all their knowledge is linked together, so it all comes out in one great lump.

I don't want to go through every single source exhaustively because I think you would get profoundly bored, but let me show you one more in particular.  Let's talk about that Wikipedia entry.



Screenshot of the Wikipedia page on Harmonia axyridis, the kind of ladybug that bites


There's lots of general information up front about the colors of this ladybug -- "ranging from yellow-orange to black" -- and how many spots -- between 0 and 22 -- which is helpful information, but again, details I would verify elsewhere.

But let's look at their paragraph about how these ladybugs were introduced into the United States, so we can compare it to our other two sources:

This species became established in North America as the result of introductions into the United States in an attempt to control the spread of aphids. In the last three decades, this insect has spread throughout the United States and Canada, and has been a prominent factor in controlling aphid populations. In the US, the first introductions took place as far back as 1916. The species repeatedly failed to establish in the wild after successfully controlling aphid populations, but an established population of beetles was observed in the wild near New Orleans, Louisiana, around 1988. In the following years, it quickly spread to other states, being occasionally observed in the Midwest within five to seven years and becoming common in the region by about 2000. The species was also established in the Northwest by 1991, and the Northeast by 1994, aided by additional introductions from the native range, rather than just reaching there from the Southeast. Reportedly, it has heavily fed on soybean aphids (which recently appeared in the US after coming from China), supposedly saving farmers vast sums of money in 2001. [Wikipedia, Harmonia axyridis]

Most of this pretty much dovetails with the other two.  The dates are another story.  Wikipedia says the first introduction of this ladybug happened as long ago as 1916.  Hmm, when three sources disagree so wildly about a date, that either means somebody is choosing to omit some instance in favor of another for some reason, or else that nobody is really sure, and so I won't be able to give you an exact, definite date.

There's that date 1988, though, when an established population was observed in the wild near New Orleans.  I think that might be what MSU meant when they said the ladybugs "jumped ship" near the Gulf Coast.  We've got references here, too, to the soybean aphid, and the timeline seems to match up pretty well with MSU's.

So I would say this entry, or at least this paragraph from this entry, seems to be pretty reliable.  I'll want to look at a couple more pages, though, to verify further and round things out.

Further Digging -- and Finding Gold

But the real reason I want to look at the Wikipedia entry is because of the sources.  Here is their sub-section on the Biology and Behavior of this bug:



Screenshot of the Biology and behavior subsection in Wikipedia's entry on Harmonia axyridis, the ladybug that bites


Since you have to click on that screenshot to read it, unfortunately, let me reproduce a bit of it to show you what I'm after:

These insects will "reflex bleed" when agitated, releasing hemolymph from their legs. The liquid has a foul odour (similar to that of dead leaves) and can cause stains. Some people have allergic reactions, including allergic rhinoconjunctivitis when exposed to these beetles.[1] Sometimes, the beetles will bite humans,[1] presumably in an attempt to acquire salt, although many people feel a pricking sensation as a lady beetle walks across the skin, which is just the pressure from the ladybird's feet. Bites normally do no more harm than cause irritation, although a small number of people are allergic to bites.[15]

First of all, you'll note that these ladybugs release a liquid that stinks like dead leaves, and that some people are allergic to the beetles.  (Allergic rhinoconjunctivitis is an allergic reaction that inflames both the nose and the eyes -- think hay fever plus itchy, red eyes -- by the way.)  Included here is that sentence that mentions the possibility that ladybugs might only be biting for salt.  I really want to verify that with another source, and this is where I want to point out those footnote reference numbers.

Those note numbers will take you down to the references at the bottom of the Wikipedia entry, which are then linked to the site where that source material appears.  In the case of these references, many of the links have died or moved or gone away, so it's turned out to be a little tricky tracking down the sources for the information here.  And you'll notice that the statement about doing it for the salt is not footnoted.  So where that bit of information came from, I'm not sure.

But the sources do turn up a goldmine.  Among other things, there's a link to an article published in The Journal of Insect Science in 2003 -- it doesn't get much more authoritative than that on the free internet -- and that article talks about how these ladybugs are cannibals!  Not that cannibalism is that rare in the insect and animal world, so it's more the shock value of saying "these ladybugs are cannibals!"  But that little fact is definitely going into the entry.  Nothing to do with the biting of people, but it's one of those bizarre tidbits I like to pass along.

The article also talks about how these ladybugs are especially problematic in vineyards, because they love to eat lots of fruit including grapes, and so they'll swarm on the grapes and vines, and the harvesters can't help but crush the ladybugs along with the grapes in the harvest.  Now that's a problem and a half.  Again, something else I would include.

They also confirm a lot of things that are in that Creature Control page.  I guess Google knew what it was doing, after all, when it put that Creature Control page high on its results list.

Nothing in this article about the salt, though.  So I'm going to do a search on Harmonia axyridis and salt.

The results?  That Wikipedia entry, another page that has copied the Wikipedia entry verbatim, another page that happens to mention Salt Lake City.  Otherwise, nada.

OK, what about Harmonia axyridis and sodium?  All I get is an extremely technical page describing what I think are the genomic and protein structures, which include sodium, in this species.  Nothing about biting, nothing about the bugs themselves wanting to eat salt.

But I do find on another University Extension Service page, this time from Minnesota, this sentence:

These bites are incidental, as the beetles are presumably searching for moisture or food.

That "presumably" word makes me suspect that this page might be the source for what Wikipedia said about salt.

Regardless, I am not going to repeat Wikipedia's supposition about salt, since I can't verify it, and since whoever wrote that entry didn't cite their source.  But I do feel pretty confident about saying what Minnesota's Extension Service said: that maybe the ladybugs are looking for sustenance.  I'll be sure to include that "maybe."

So this gives you an idea of how I go back & forth between sources, how I use them to verify or confirm each other, or to connect me to additional sources for further information.  I often discover further clarifications or qualifications the further I dig, and sometimes those clarifications that people omit when they're trying to generalize turn out to be pretty interesting. 
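For the programming-minded, the cross-checking habit described above can be sketched in a few lines of Python.  This is only an illustration, not a tool I actually use.  The answers are my own shorthand for what each of the three sources said in the paragraphs quoted in this entry, and the rule that an answer needs two backers before it counts as confirmed is an assumption made up for the example.

```python
from collections import defaultdict

# Shorthand for what each source said in the paragraphs quoted in this entry.
# The short labels ("control aphids", etc.) are paraphrases, not quotations.
claims = {
    "when introduced to the US": {
        "Creature Control": "1988",
        "Michigan State University": "late 1970s",
        "Wikipedia": "1916",
    },
    "why introduced": {
        "Creature Control": "control aphids",
        "Michigan State University": "control aphids",
        "Wikipedia": "control aphids",
    },
}

def cross_check(answers_by_source, needed=2):
    """Group sources by the answer they give; keep answers with enough backers.

    `needed` is a hypothetical threshold chosen for this sketch.
    """
    backers = defaultdict(list)
    for source, answer in answers_by_source.items():
        backers[answer].append(source)
    return {answer: srcs for answer, srcs in backers.items() if len(srcs) >= needed}

for question, answers in claims.items():
    confirmed = cross_check(answers)
    if confirmed:
        print(question, "-> confirmed:", confirmed)
    else:
        print(question, "-> sources disagree, keep digging")
```

The date question comes back as "keep digging," while the aphid question comes back confirmed by all three sources, which matches where things stood in the discussion above.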

What's Next

I wanted to present you with a complete entry at the end of this discussion, but I have to go to bed now. So maybe what I'll do instead is give you the text of the entry next time, maybe with some meta-discussion about how I made decisions here & there about what to include and in what order.  Because finding the images takes a long time, I imagine talking about the images will have to be yet another entry of its own.

Sorry there weren't more pictures with this one.  Here, I'll put in one last picture.


This is the biting kind of ladybug.  You can identify it as such by the M or W shape on the lighter colored back of its neck behind the head.
(Photo from Maclean's in CA)


Sources
[since I'm giving you the behind-the-scenes view, I'll give you the plain, unadulterated URLs]

http://www.creaturecontrol.net/Asian%20Lady%20Bug
http://www.pestid.msu.edu/insects-and-arthropods/multicolored-asian-ladybeetle/
http://en.wikipedia.org/wiki/Harmonia_axyridis
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC524671/
http://www.extension.umn.edu/garden/insects/find/multicolored-asian-lady-beetles/
http://www.biocontrol.entomology.cornell.edu/predators/Harmonia.php

Sunday, April 19, 2015

Apple #708: Behind the Daily Apple -- Constructing a Search Query

After my last entry (on movie trailers), Daily Apple reader Mahalia wanted to know how your Apple Lady does what she does.  As she put it, how does an entry get researched and written, and how do I find the answers to people's questions.

I told her I Google it.

Which made her laugh and say, OK, but it seems like my entries are more focused and thorough than a list of Google results, and there are complementary visuals, too.  So I said I was being flip, that there is more to it than just typing words into a Google search box.

And actually, when I've looked things up for other people, they'll say, "How did you find that?"  I'll show them the search I did, and they'll say, "I wouldn't have thought to do it that way."  I don't think what I do is anything particularly magical, in fact it seems pretty obvious to me, but then, I learned this skill in library science school years and years ago, so at this point, it is second-nature to me.  Not everyone knows how to put together a good search query.  And since that is where every Daily Apple begins, let me start there.




The Google search box. The secret to your Apple Lady's success.


Google

  • There are lots of search engine websites out there -- Google, Bing, Dogpile, Ask, even AOL -- but I search Google.  It's the biggest.  Meaning, Google's web crawlers hit more pages of the internet than anyone else's do.  So with one search, I get results from the highest number of pages possible.
  • Dogpile is a meta-search that searches several search engines at once, so you would think that would get more inclusive results.  But in my experience, Google beats them anyway.  And by "beats them," I mean gives you a wider variety of results which are of better quality.
  • By "better quality," I think that might be best demonstrated when I show you the results of some sample searches.
  • Another reason I like Google is it allows you to use some advanced searching tools, like quotation marks, a symbol that means "or", and it also does automatic truncation.  What I mean by that will become apparent in a bit.

Constructing Your Query

  • The first part of figuring out what to type in the search box is deciding what you're looking for.  This sounds like a "duh" moment, but as in all things, it really does help to define to yourself what you're doing before you do it.
  • Here's an example.  Once upon a time, a Daily Apple reader Dan asked me to do an entry on disposable lighters.  Specifically, he asked, 
What about doing a daily apple entry about the disposable lighter? i know zippos and other butane lighters have been around for awhile, but what about the plastic disposable Bic? So ubiquitous these days.
  • So I Googled "disposable lighters."  The first thing to note here is that I typed in my search in quotation marks.  As you see it here, I typed into the Google search box "disposable lighters" as opposed to disposable lighters.
  • This is important because the quotation marks mean I've told Google I want it to search for those two words next to each other, as a phrase (a.k.a. phrase searching).  If I had omitted the quotation marks, Google could have returned any results with disposable in one part of the document and lighters anywhere else in the document.  So I could have gotten results that might have had nothing to do with disposable lighters at all.
  • Google is smart enough, though, that even if I had omitted the quotation marks, it would automatically put the hits where disposable is next to lighters at the top of the list.  Not every search engine does that, and this is another reason to prefer Google -- it automatically ranks its results for you so that the results that most closely match what it thinks you want are put at the top of the list.
  • But for our purposes, the results I got from Googling "disposable lighters" weren't the best for putting together a Daily Apple entry.  Which is to say, the things that came up first were all shopping-related.  Links to pages where they're sold on Amazon, and other online stores.  It would be no fun to read a Daily Apple entry on "Where can I buy disposable lighters, and how much do they cost?"  That wouldn't be answering what Dan was asking, either.
  • So how should I narrow the "disposable lighters" field?  Since Dan's question referred to other lighters that had been around for a while, I chose to investigate the history of them -- when were they invented -- and how had they become so popular.
  • So I modified my search to "disposable lighters" history. 
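If it helps to see the difference between a quoted and an unquoted search spelled out, here is a minimal Python sketch.  It is a toy model only: it does not reproduce how Google actually matches or ranks pages, and the function names and the two sample sentences are made up for the illustration.

```python
import re

def words(text):
    """Lowercase the text and split it into plain words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def matches_all_words(document, query):
    """Unquoted search: every query word appears somewhere in the document."""
    doc_words = set(words(document))
    return all(w in doc_words for w in words(query))

def matches_phrase(document, phrase):
    """Quoted search: the query words appear next to each other, in order."""
    doc_words = words(document)
    target = words(phrase)
    n = len(target)
    return any(doc_words[i:i + n] == target
               for i in range(len(doc_words) - n + 1))

# Two made-up sample "pages" for the illustration.
on_topic = "A short history of disposable lighters and who invented them."
off_topic = "Disposable razors on sale. Also in stock: charcoal lighters."

for doc in (on_topic, off_topic):
    print(matches_all_words(doc, "disposable lighters"),
          matches_phrase(doc, "disposable lighters"))
```

Both sample pages pass the unquoted test, because each contains both words somewhere.  Only the first passes the quoted test, because only there do the two words sit side by side.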
[Editor's note about screenshots: Blogger does not allow screenshots to be copied & pasted into an entry. That would be way too easy.  They must be saved as images & uploaded. Doing that changes all sorts of things about the image -- mainly makes them too small. So if you want to read what the screenshot-images actually say, you have to click on them to see them in an enlarged photo viewer. When you're done, click the x in the top right corner of the photo viewer to return to the blog.  Total pain in the behind, I know. But you can thank Blogger for this.]

Today's Google search results for "disposable lighters" history. One of the things I don't like about Google is that they pay their bills by putting paid-for links to product purchases at the top of their search results. This would be like walking up to your friendly local librarian and saying, "I need to know about the history of aspirin," and before telling you where to find information about that, she would whip out a bunch of samples and ask you rapid-fire, "Would you like to buy this aspirin? This other kind of aspirin is very popular, perhaps you would like to buy some of that instead." And you would answer her, "But I just want to know about aspirin."

Google has also taken to providing you with images that match your search results, which can come in very handy sometimes.  For example, if I've seen a type of bird and I type in what I think is its name or else my description of it, those images will show me if I've got the right name for the bird I've seen or will help me choose among slightly different images for the one that matches most closely with the bird I saw, and that will take me to a page that tells me all about that bird.  In other words, the image search sometimes helps me narrow my choices by kind or species.

Finally, please to note the hit that comes up at the bottom, just before the image results.



Natural Language Queries

  • Some people type in their searches as natural language requests, meaning they simply type in the sentence that is their question: when were disposable lighters invented? Or, what is the history of disposable lighters?
  • Back in the day, when online databases were all proprietary and expensive, they worked only literally.  If you typed in a search like that, you would be telling the database to find every instance of the word "what," every instance of the word "is," every instance of the word "the," and so on, and then combine the results to show you only those items that contained each one of the words in that query, regardless of where they happened to appear in the document.  Can you imagine, searching the entire internet for all pages that have the word "is"? 
  • So we online search librarians learned to ask the databases to search for only those words that are the most important.  We would leave out the little, omnipresent words like "is" and "the" and "where."  In fact, some databases got smarter and wouldn't even search for those words. Because they appear so often, it would take the database forever to retrieve them all, and it wouldn't even be that useful to list them.  These little omnipresent words are therefore often referred to as stop words.
  • Let's remove the stop words from our natural language request and see what we've got left: what is the history of disposable lighters?  What remains is history "disposable lighters."  This, by the way, is known as a keyword search.  This is the kind of search I do most often.
  • The clever folks at Google have over the past several years made their search engine smart enough to handle natural language requests.  So if you were to type what is the history of disposable lighters into the Google search box now, you would likely get pretty good results.   
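
For the code-minded among you, stripping out stop words can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the stop word list is made up and far shorter than anything a real database would use, the function name to_keywords is hypothetical, and none of this is how Google itself works.

```python
# A made-up, deliberately tiny stop word list (an assumption for illustration;
# real databases and search engines keep their own, much longer lists).
STOP_WORDS = {"what", "is", "the", "of", "a", "an", "when", "where", "were"}


def to_keywords(question):
    """Drop stop words from a natural language question, keeping word order."""
    words = question.lower().rstrip("?").split()
    return [word for word in words if word not in STOP_WORDS]


print(to_keywords("What is the history of disposable lighters?"))
# prints ['history', 'disposable', 'lighters']
```

Note that the sketch only throws away the little words. You would still have to put the quotation marks around "disposable lighters" yourself to search for it as a phrase.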

Today's results for a natural language search in Google for what is the history of disposable lighters.

  • Well, that's interesting. There are no paid-for links at the top of these results.  There are no Image results, either.  News about Google reports that they've been putting a lot of effort into making their search algorithms even better at processing natural language queries. This is because a) most people tend to type in natural language queries and b) lots of people have smart phones, and they're relying on apps to give them search results in very targeted areas. So Google has to make their search engine competitive and give better, more targeted and focused results for natural language queries.
  • So maybe my technique of using keyword searching is becoming outmoded.  I do use natural language queries occasionally. If a topic is especially arcane and I'm having trouble finding something with just keywords, I will give a natural language query a shot, to see if I get different results, or maybe one hit that's relevant that will give me some more information that I can then use as the basis for another, better search.
  • But based on these results here, maybe I should try the natural language method more often.  But this is only one example, and in general, I feel like I have better control over the results I get with a keyword search, so probably for some time longer at least, I'll continue to use the keyword method.  

Why It's Important to Define Your Question

  • But now let me get to the ultimate point I was trying to make with this disposable lighters example.  After I posted that entry on disposable lighters and Daily Apple reader Dan had a chance to read it, I asked him what he thought, if it answered his question.  He said actually, no, it hadn't.  
  • He said what he was really thinking of was those great big huge rafts of plastic that are floating around the oceans, how there are often plastic disposable lighters among those piles of floating trash, and how birds or fish eat them and are killed.
  • Well, that is a far more depressing and specific topic than the question he gave me.  But he did not tell me anything about that.  He did not narrow his topic further or tell me what exactly it was about disposable lighters he wanted to know.  Based on the question he sent me, I assumed he wanted a general history.  
  • As every good reference librarian knows to do, I should have asked him to verify my assumptions, and perhaps narrow the topic more specifically.  I should have replied to him saying, "I think what you want to know is the history of disposable lighters, right? Who invented them and when and so on?"  And he could have answered back with, "Well, really what I want to know about is how all those plastic lighters get in the lakes of floating trash."  Then I would have researched that.  And we would have a very different Daily Apple about disposable lighters.  I probably would have called it Lakes of Floating Trash or something else, rather than Disposable Lighters.
  • But this is important for you to keep in mind, too, as you're searching the web.  Let's say you're out with your friends and you're talking about lions, and someone says, "You know, it's the females who do all the hunting." And someone else says that's not true, the males hunt too, and someone else says they do not, the males are useless, and it gets rather heated and gender-angry.  You want to put a stop to this by finding the answer on your smart phone about who does the hunting among lions.  Everybody's kind of drunk, and the argument is starting to spiral out of control, so you want to find the answer quickly.
  • If you were to type lions into your smartphone's Google search box, you would get a mishmash of everything in the world about lions. And I mean all kinds of lions.



Today's results of searching Google for the word lions.

  • You'd get the link to the Detroit Lions homepage, news about lions, news about Lions Clubs -- nothing even close to what you want to know.
  • So you have to go back and ask yourself what it is about lions that you want to know.  You want to know which lions do the hunting, the males or the females.  
  • Cross out all the little words in that sentence, and what do you have left?  Which lions do the hunting, the males or the females?  What's left is lions, hunting, males or females.  I haven't crossed out the or because that is a very useful word.  More on that in a minute.
  • Let's pretend you are in such a hurry, you decide you just want to know about lions and hunting. So you type lions hunt into the Google search box.



Today's results of searching Google for lions hunt.

  • Well, this is better. The very first hit is a link to a video showing male lions hunting and making a kill.  So that could be your answer right there.  You might be satisfied with saying, "Look, here's a video that shows male lions hunting.  Proof positive, male lions hunt!"
  • But someone could come back and say, "Oh yeah? Well, that's just one video. That probably hardly ever happens.  The females do most of the hunting.  Disney told me so."
  • That link toward the bottom of the page, How lions hunt, seems promising.  You click on it (as I did) and skim it but discover that it says nothing about males or females.  It has a lot of interesting information about how lions stalk their prey, how they aren't that fast so they have to hide and lie in wait for a long time, how they are very, very patient, waiting until the prey wanders close enough, and then they rush out and pounce.  It even says that lions don't use wind direction in their favor all that much, they're just patient and they time their pouncing very carefully.
  • All that is very interesting, but it doesn't answer the question at hand.  And meanwhile, your group is getting more heated, starting to raise their voices, so you need to find an answer, and quickly.
    • Before I continue, I want to insert a note about automatic truncation.  You'll notice that Google automatically provided results that use all sorts of versions of the word hunt -- hunting, hunts, hunted.  
    • I don't know whether Google incorporated hunter and hunters or not, or if those results were less relevant and so appear farther down the list.  It could be that Google is smart enough to know that lion hunters is a different thing than lions hunting and sorted the search results accordingly.
    • But the fact that Google automatically searched for these various forms of the same word means that it automatically translated hunt into hunt*, where the * stands for any suffix that might follow.  This is called automatic truncation.  It can save you a lot of time and forethought, combining all sorts of relevant results that you might not have thought to gather on your own.
    • Another thing Google sometimes does is to automatically search for synonyms for your search terms.  In this case, it looks like it also searched for attack as a synonym for hunt.  More on that in a bit.
    • Let's pretend for a minute that you typed lions hunters, and you wanted Google to search for only those words exactly.  Let's pretend you didn't want it to find any synonyms or alternate versions of either of those words.  Then you would tell it to do a Verbatim search.
    • To do a Verbatim search, click on Search tools. In the mini-window that appears, choose Verbatim. That's it.  Google will then search only for the words you've entered as you've typed them.
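
If it helps to see the difference between hunt* and a Verbatim search spelled out, here is a rough Python sketch using regular expressions. The sample sentence and the function name find_forms are invented for illustration, and this is only a stand-in for the idea, not a description of what Google actually does behind the scenes.

```python
import re


def find_forms(stem, text, verbatim=False):
    """Return the words in text that match stem.

    verbatim=True  -> only the exact word, like a Verbatim search.
    verbatim=False -> the stem plus any suffix, like hunt*.
    """
    if verbatim:
        pattern = re.compile(r"\b" + re.escape(stem) + r"\b", re.IGNORECASE)
    else:
        pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
    return pattern.findall(text)


# A made-up sentence to search in.
text = "Lions hunt at night. The hunting lioness hunted alone, and hunters watched her hunt."

print(find_forms("hunt", text))
# prints ['hunt', 'hunting', 'hunted', 'hunters', 'hunt']
print(find_forms("hunt", text, verbatim=True))
# prints ['hunt', 'hunt']
```

Notice that a plain hunt* sweeps in hunters along with hunting and hunted, which is exactly the kind of thing a smarter search engine might choose to rank lower.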


Screenshot of how to do a Verbatim search. You'll have to click on the image to see it clearly (thanks, Blogger, for the obfuscation).
 

Synonym Searching -- The Magic of Or

    • But let's get back to that concept that just came up, which is searching for synonyms. Let's pretend Google didn't search automatically for synonyms, but that you wanted it to do so.  Let's pretend you were thinking expansively, as old-school online searchers know to do, and you knew that there might be pages that might not use the word hunt, but maybe they'd use a different verb that means something similar, so those pages would still be relevant.
    • In order to find those other pages, you would need to think of all sorts of verbs that mean roughly the same thing as hunt--attack, kill, stalk, chase. You would want to find all the pages that use any of those verbs: hunt or attack or kill or stalk or chase.  I'm emphasizing the word or here because it's important -- because of set theory.
    • When you say or, you mean you want anything that falls in either set, or in both.  Remember in elementary school when you learned about the intersections and unions of two sets?  Intersections are when you want to know about the place where the two sets overlap and only where both things are true -- in search language that's and.  Unions of two sets are when you want to know about everything in both of the sets together.  In search language, that's or.
    • If you just typed or into your search string, Google or any search engine might not know that you mean that word as a search function.  It might think you want it to search for the actual word or, or it might treat it as a stop word.  In order to eliminate that confusion, most databases substitute a special character that stands for the set-theory-meaning of or.  In Google's case, that special symbol is |.
    • On a typing keyboard, you can find that symbol above the \ key.  You would type [shift +] \ to get |.  On a smartphone keypad, I have no idea where that symbol is.  Those keypads are annoying and impossible to use anyway and I hate them with an electronic passion.  Ahem.
    • So if you wanted to find lions hunt or attack or kill or stalk or chase, you would type that into Google as lions (hunt | attack | kill | stalk | chase).  You would group all the synonyms together into parentheses because search engines work like math.  
    • Remember how in basic algebra, you learned that you're supposed to do the math of the things in the parentheses first, and otherwise do the math from left to right?  Well, without the parentheses, Google would do the search math from left to right.
    • If you were to type in lions hunt | attack | kill | stalk | chase, instead of searching for lions and [all the rest of the verbs that mean to hunt], it would search for lions and hunt, or any pages that use the word attack, or any pages that use the word kill, or any pages that use the word stalk, or any pages that use the word chase.  So you would get a whole bunch of pages on attacking in general, killing in general, stalking in general, chasing in general, and there might be a few about lions hunting.  Which is not what you want at all.
    • Now, as we've learned, Google is smart enough to know not to do this -- most of the time.  But you can't always count on Google being smart enough to know what you mean every time.  So you'll get the best results the most often if you use the terminology and search strategy that is certain to get you the more accurate results.  Otherwise, it could be a case of garbage in, garbage out.
  • Now that you know about the magic of or and the importance of parentheses, you know to refine your search about lions hunting thusly:  lions hunt (males | females).
  • You enter your search this way because you want to find out about how lions hunt, specifically in regard to gender, whether they're female or male.  You don't want to know about how only male lions hunt, or only about how female lions hunt, you want to know about either one.  So you know to use the operator that means or.  You also know to put males and females in parentheses so that Google will know to or those two concepts together, and then and them with the concepts of lions and hunt.
  • So as I've said, you type lions hunt (males | females) into the Google search engine box. The results?  Bingo.
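
The set theory behind or, and, and those parentheses can be sketched in Python, where | means union and & means intersection. Everything in this sketch is invented for illustration: the little index of page numbers is made up, the helper name pages is hypothetical, and the strict left-to-right reading is the old-school database behavior described above, not a claim about how Google evaluates a query.

```python
# A made-up index: each word maps to the set of page numbers that contain it.
INDEX = {
    "lions": {1, 2, 3},
    "hunt": {1, 4},
    "attack": {2, 5},
    "kill": {3, 6},
    "stalk": {7},
    "chase": {8},
}


def pages(word):
    """Return the set of pages containing word (empty set if none)."""
    return INDEX.get(word, set())


verbs = ["hunt", "attack", "kill", "stalk", "chase"]

# lions (hunt | attack | kill | stalk | chase):
# union (or) the verbs first, then intersect (and) with lions.
any_verb = set().union(*(pages(verb) for verb in verbs))
with_parens = pages("lions") & any_verb

# lions hunt | attack | kill | stalk | chase, read strictly left to right:
# lions and hunt first, then or in every page for each remaining verb.
without_parens = pages("lions") & pages("hunt")
for verb in verbs[1:]:
    without_parens = without_parens | pages(verb)

print(sorted(with_parens))
# prints [1, 2, 3]
print(sorted(without_parens))
# prints [1, 2, 3, 5, 6, 7, 8]
```

With the parentheses, every page that comes back mentions lions. Without them, pages 5 through 8 sneak in even though they have nothing to do with lions, which is the garbage-in, garbage-out problem in miniature.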


Today's results of searching Google for lions hunt (males | females)

  • Here, in stunning, tiny display are the results.  As you can see when you bring the screen really close to your face and squint hard, our first two hits are from Wikipedia and Yahoo Answers. They both say that both male and female lions hunt.  But you know that Yahoo Answers isn't all that authoritative, and Wikipedia can be a good place to start for information, but you should always verify anything you find there with at least one other source.
  • You notice a link to a UPI article from 2013.  You know that UPI is generally a very reliable news source, and 2013 is pretty recent. The headline says that what we thought was true about male lions has been shown to be something different, which suggests there will be a nuanced answer here.  You know that the truth often is nuanced, so this looks like a promising result.  
  • You click the link to the UPI article (as I did) and you discover that researchers have learned that female lions hunt, and so do male lions.  The difference is that female lions hunt cooperatively, in groups, at night, and in areas of open vegetation.  Male lions hunt less cooperatively, often solo, during the daytime, but in areas with lots of vegetation where it is easier for them to hide and stalk their prey. 
  • You could make a lot of guesses or assumptions based on this information.  Maybe the reason we thought only female lions hunt and males don't is because it's easier for us to see females hunting in groups in the open, even if it is in the dark of night.  We didn't see the male lions hiding in the brush, so we didn't think they were hunting.  Or maybe, because of changes in habitat (probably human-induced) there is less vegetation in which the males can hide and stalk prey, so we don't see them doing this as often, and so we assumed the males don't hunt.
  • If I were doing a Daily Apple on this topic, I would investigate those further guesses and assumptions and report on those findings to give you some more explanation and context.  
  • But since you are sitting among your group of friends who are getting more heated over this argument about male and female lions hunting, and it's about to break into some kind of gender war, this is enough information for you.  You announce to the group, "Male and female lions both hunt.  They just do it differently."
  • Your friends, who have worked themselves into a state where they are ready for a full-blown fight, are a little disappointed not to have the fire to further their fight, that instead the water of truth has doused their ire.  But they know the right answer when they hear it, and they settle down.


Male lion hunting an eland. People tend to believe pictures more than words, so here's a picture.  We'll talk about how to find images in the next entry.  The eland got away, by the way.
(Photo from Africa Geographic)

Here's a female lion hunting a water buffalo, by herself, in the daytime.  Maybe there are other female lions nearby, I don't know.  But it's probably a good idea not to generalize too much about animal behavior.  As we who are animals ourselves know, we don't always conform to type. Especially where gender is concerned.
 (Photo from Facts Legend)

What You Know Now

  • Now you know how the Apple Lady begins working on every Daily Apple question.  
  • More importantly, now you know how you can construct a targeted search yourself.
  • Even if you didn't absorb all that stuff about putting your search terms in parentheses, or using the | character, or searching for synonyms, you probably absorbed the fact that Google is pretty smart.  Smart enough to parse your search for you so that, most of the time, even if the terms you type in are kind of off the mark or not that precise, Google's search algorithms will compensate for that and give you good results anyway. 
  • Maybe you even took in the fact that if you typed in your search as a question, the same way you might ask Siri a question (she's just a pretty voice on top of a search algorithm anyway), Google will give you pretty decent results.  And maybe even with fewer ads.
  • Hopefully you have also gleaned a much bigger-picture concept, which is that looking up information online is pretty easy.   In fact it's pretty dang easy, thanks to the internet and extremely well-engineered search engines like Google, to find out the answers to all sorts of questions.  But you probably already knew that from how often you use your smartphones to settle debates among your friends. 
  • But it's the internet and search engines that make all this possible.  Where once upon a time you had to go to the library -- and I dearly love libraries -- and look things up in dictionaries and encyclopedias and card catalogs -- and I dearly love all those things -- now you don't have to wait until the library is open and go there and search through those books -- oh, how I love books -- to slowly compile the answer to a question.  
  • Now all you have to do is type a few words into a box and push a button, and you get a raft of answers, links to a huge array of knowledge that has been built and constructed by scads of people over the course of decades.  I mean, when you think of the sheer volume of information at our disposal, the amount of people-hours and person-thought that's gone into all the knowledge that pops up almost immediately from one search, and you can tap into all of that and learn from it so quickly, it is beyond staggering.  It gives me the chills.
  • And we are only going to learn more.  We are only going to get better at this business of learning and teaching ourselves stuff.  If one day we learned things the way they do in the first Matrix movie, where they plug Neo into a machine, push a button, and he jerks a bit, and then he says, "I know kung fu," I think I would melt with delicious joy.  That sort of method of learning is probably not even that far in the future.  The more we know, the easier it is for us to learn more.  I'm repeating myself all over the place because I'm so excited thinking about it.
  • The more immediate point here is this: what I do here on this blog is not that difficult.  You can find the answers to your questions so easily.  Type a couple words into Google and push Search.  See what you find.  I guarantee you'll learn something.  It'll be good.