# Too-smart search engines



## wblock@ (Nov 15, 2011)

Is there a web search engine that will not filter what I enter and actually let me search for things with backslashes in them?

For example, I see some suspicious stuff in the Apache logs, a request for a "file" that is pretty obviously some type of exploit starting with "\xd8\x9b".

Google filters out the backslash.  Always.  With quotes or +, too.


----------



## expl (Nov 15, 2011)

I do not know what filtering you are referring to, but google does not filter backslashes.
And the combination of these two bytes, \xd8\x9b, is the UTF-8 encoding of the Unicode character ARABIC SEMICOLON.
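
As a quick check of that claim, here is a minimal Python sketch that decodes the two bytes as UTF-8 and looks up the character name. It assumes the `\xd8\x9b` in the log stands for the raw bytes 0xD8 0x9B and that they are meant to be read as UTF-8, which the log itself does not say.

```python
import unicodedata

raw = b"\xd8\x9b"            # the two bytes from the log entry
char = raw.decode("utf-8")   # interpret them as one UTF-8 sequence

codepoint = f"U+{ord(char):04X}"
name = unicodedata.name(char)
print(codepoint, name)       # U+061B ARABIC SEMICOLON
```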


----------



## Anonymous (Nov 15, 2011)

wblock@ said:
			
		

> For example, I see some suspicious stuff in the Apache logs, a request for a "file" that is pretty obviously some type of exploit starting with "\xd8\x9b".



Did you try enclosing your search in double quotes?

For example I entered "\xd8\x9b" including the quotes into the search field of Safari (Mac), and this created the following google URL:

http://www.google.com/search?client=safari&rls=en&q=%22%5Cxd8%5Cx9b%22&ie=UTF-8&oe=UTF-8

Guess what, Google found this thread as the first hit.

Anyway, double-quoting the search string does not always work the way I intend either.

Best regards

Rolf


----------



## wblock@ (Nov 15, 2011)

expl said:
			
		

> I do not know what filtering you are referring to, but google does not filter backslashes.



Try searching on mod_security \x.  The backslash is ignored, whether in quotes or preceded by a + or both.  As rolfheinrich points out, using a longer string helps.  Look at the bolded entries in the results; the backslash is still being filtered out before searching.



> And combination of these two bytes \xd8\x9b makes unicode symbol for ARABIC SEMICOLON.



The complete request string is

```
"HEAD /]\xd8\x9b\x1b\xd8\xda\xcb\xd9\x1b\xd8\xdc\xcb\xdc\x19\x19\x8b\xdc\x1e\x19K\x9c\x19\x19 HTTP/1.1"
```
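
To look at the request as bytes rather than as log text, the `\xhh` escapes can be turned back into raw bytes. The following is a rough Python sketch: `unescape_log` is a made-up helper name, and it only handles `\xhh` sequences (Apache also writes some characters as `\"`, `\\`, `\n` or `\t`, which this ignores). Under those assumptions it suggests that the path as a whole is not valid UTF-8, even though its first two escaped bytes are.

```python
import re

# Path portion of the logged request, copied from the line above.
LOGGED_PATH = (r"/]\xd8\x9b\x1b\xd8\xda\xcb\xd9\x1b\xd8\xdc\xcb\xdc"
               r"\x19\x19\x8b\xdc\x1e\x19K\x9c\x19\x19")

def unescape_log(text: str) -> bytes:
    """Turn \\xhh escapes back into single bytes (hypothetical helper)."""
    out = bytearray()
    pos = 0
    for m in re.finditer(r"\\x([0-9a-fA-F]{2})", text):
        out += text[pos:m.start()].encode("ascii")  # literal characters
        out.append(int(m.group(1), 16))             # escaped byte
        pos = m.end()
    out += text[pos:].encode("ascii")
    return bytes(out)

raw = unescape_log(LOGGED_PATH)
print(raw.hex(" "))

# Strict decoding fails if any byte sequence is not well-formed UTF-8.
try:
    raw.decode("utf-8")
    is_utf8 = True
except UnicodeDecodeError:
    is_utf8 = False
print("valid UTF-8:", is_utf8)
```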


----------



## Bobbla (Nov 15, 2011)

wblock@ said:
			
		

> With quotes or +, too.



Don't know how relevant this is to what you are looking for, but google has changed their search a little compared to how it used to be.

http://www.ghacks.net/2011/10/25/google-replaces-search-operator/

Hope that helps a little, if not I'm sorry to have wasted your time.


----------



## Anonymous (Nov 15, 2011)

wblock@ said:
			
		

> ... The backslash is ignored, whether in quotes or preceded by a + or both ...



Sorry if I am too ignorant to see your point, however, the Google search engine does exactly what I would expect it to do with backslashes: it URL-encodes them. So, if I enter a single backslash into the search field of http://www.google.com, it produces the following URL:


```
http://www.google.com/search?client=safari&rls=en&q=%5C&ie=UTF-8&oe=UTF-8
```

%5C is the URL-encoded backslash (ASCII 0x5C), and as a matter of fact, the first hit of the search is the Wikipedia page explaining BACKSLASH.

So the backslash is not filtered out but URL-encoded. IMHO, this is simply how it works.
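
The encoding step can be reproduced outside the browser. This is a small Python sketch using the standard `urllib.parse` module; it only shows what the URL looks like, not what the search engine does with the query afterwards.

```python
from urllib.parse import quote, unquote

single = quote("\\", safe="")            # a lone backslash
quoted = quote('"\\xd8\\x9b"', safe="")  # the quoted search string

print(single)   # %5C
print(quoted)   # %22%5Cxd8%5Cx9b%22

# Decoding gives the original characters back, so nothing is lost in the URL.
roundtrip = unquote(quoted)
print(roundtrip)
```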

Best regards

Rolf


----------



## Crivens (Nov 15, 2011)

Totally on-topic and off-helpful in this context, but I remembered and had to smile.


----------



## wblock@ (Nov 15, 2011)

Bobbla said:
			
		

> Don't know how relevant this is to what you are looking for, but google has changed their search a little compared to how it used to be.
> 
> http://www.ghacks.net/2011/10/25/google-replaces-search-operator/



I didn't know about that.  Thanks!


----------



## wblock@ (Nov 15, 2011)

rolfheinrich said:
			
		

> Sorry if I am too ignorant to see your point, however, the Google search engine does exactly what I would expect it to do with backslashes: it URL-encodes them. So, if I enter a single backslash into the search field of http://www.google.com, it produces the following URL:
>
> ...



This also looks to be context-sensitive.  I can't get a search for "\x" to search for those literal characters.  The backslash is in the URL (although not encoded here), but it's not in the results, which are identical to just searching on "x".  So rather than saying the backslash is filtered out, it just isn't included in the comparison.

Bing does that also, although it's not as smart about context (or anything, really) and treats a single backslash as nothing.  (Like Silverlight, I suspect Bing is just a pretend implementation so somebody can say "See, we've got that too!")
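
One way to picture the behaviour described above is a tokenizer that keeps only word characters before the query is compared against the index. The Python sketch below is a toy model under that assumption; it is not how Google or Bing are known to work, and `tokenize` is an invented name.

```python
import re

def tokenize(query: str) -> list[str]:
    """Toy tokenizer: lowercase, keep runs of letters, digits and '_'."""
    return re.findall(r"[a-z0-9_]+", query.lower())

with_backslash = tokenize(r"mod_security \x")
without_backslash = tokenize("mod_security x")
lone_backslash = tokenize("\\")

print(with_backslash)      # ['mod_security', 'x']
print(without_backslash)   # ['mod_security', 'x']
print(lone_backslash)      # []
```

With a model like this, `\x` and `x` produce the same tokens, and a single backslash produces none at all, which would match the results seen here.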


----------



## tingo (Nov 16, 2011)

I tried searching for the HEAD... string.
Let's see: ask.com, dogpile, kvasir.no (Google-powered), metacrawler, search.com and WebCrawler show this thread; Alexa, altavista, gigablast, Lycos and search.yahoo.com don't find a thing; ixquick filters away anything after head...
The strangest result came from WolframAlpha.
What others?


----------



## Carpetsmoker (Nov 16, 2011)

DuckDuckGo:
http://duckduckgo.com/?q=\xd8\x9b


----------



## Slurp (Nov 16, 2011)

Carpetsmoker said:
			
		

> DuckDuckGo:
> http://duckduckgo.com/?q=\xd8\x9b



You beat me to it.


----------



## graudeejs (Nov 16, 2011)

I was just about to post about DuckDuckGo.


----------



## fonz (Nov 17, 2011)

Carpetsmoker said:
			
		

> DuckDuckGo


Looks promising! It's perfectly possible to stay away from Google in many ways, but web searching is a toughie.

Fonz


----------



## Slurp (Nov 18, 2011)

fonz said:
			
		

> Looks promising! It's perfectly possible to stay away from Google in many ways, but web searching is a toughie.
> 
> Fonz



DDG is usually great, but 1 search in 50-150 doesn't produce useful results. I resort to Bing then, and sometimes to Google too. Simply put, Google's database is the biggest around and DDG is near the other end of the spectrum. So I think it's hard to really leave the Google search engine, but reducing my traffic there to a fraction of a percent is OK for me.


----------



## Carpetsmoker (Nov 19, 2011)

DuckDuckGo works for >90% of the searches for me.


----------



## ramonovski (Nov 19, 2011)

DuckDuckGo works for >90% of the searches in English language for me.

Searching at DDG in my native language (Spanish) is (sadly) a joke.


----------



## fluca1978 (Nov 21, 2011)

DuckDuckGo works fine for me, and most importantly, it does not track what, where, or when I'm searching.


----------



## Crivens (Nov 21, 2011)

Those of you who need Google but do not want to give them a fingerprint and iris scan each time you look at the screen (well, a print from a different part of the anatomy would ... never mind) should look at Scroogle. They are a proxy for Google, do not keep access logs (their claim), do not set cookies (have not seen them trying), and do not serve ads as a side order.


----------



## Slurp (Nov 23, 2011)

Too bad that you can't create a URL for a particular search in Scroogle.
Right now when I want to use Google, I type "g search_term1 search_term2", which my browser translates to "www.google.com/search?q=search_term1+search_term2" or something like this. I wanted to change the meaning of g to use Scroogle instead, but it doesn't seem possible.
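
For reference, the keyword expansion described above amounts to something like the following Python sketch. The `SHORTCUTS` table and the `expand` function are illustrative names, the `http://` prefix is assumed, and browsers implement this internally in their own way.

```python
from urllib.parse import quote_plus

# Keyword -> URL template; "%s" marks where the search terms go.
SHORTCUTS = {
    "g": "http://www.google.com/search?q=%s",
}

def expand(typed: str) -> str:
    """Expand 'keyword term1 term2' into a search URL (illustrative)."""
    keyword, _, terms = typed.partition(" ")
    template = SHORTCUTS[keyword]
    return template.replace("%s", quote_plus(terms))

url = expand("g search_term1 search_term2")
print(url)  # http://www.google.com/search?q=search_term1+search_term2
```

Pointing `g` at another engine only works if that engine accepts the search terms as part of the URL, which is the sticking point mentioned here.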


----------



## draco003 (Nov 23, 2011)

Perhaps give *kngine* a try

http://www.kngine.com/Search?q=mod_security+%5Cx


----------



## YZMSQ (Nov 24, 2011)

draco003 said:
			
		

> Perhaps give *kngine* a try
> 
> http://www.kngine.com/Search?q=mod_security+%5Cx


It seems its search results are dated? I searched for FreeBSD there, and saw this:





We're living in 2007 now? :O


----------



## Slurp (Nov 24, 2011)

Yeah, whatever volatile information I search for, it's out of date.


----------



## draco003 (Nov 24, 2011)

OOPS ...
`# history -c`

=P


----------

