How to Speed up Search Engine Indexing

It’s common knowledge that nowadays users search not only for trusted sources of information but also for fresh content. That’s why, over the last couple of years, the search engines have been working on speeding up their indexing process. A few months ago, Google announced the completion of its new indexing system, called Caffeine, which promises fresher results and faster indexation.

The truth is that, compared to the past, the indexing process has become much faster. Nevertheless, lots of webmasters still face indexing problems, either when they launch a new website or when they add new pages. In this article we will discuss 5 simple SEO techniques that can help you speed up the indexation of your website.

1. Add links on high traffic websites

The best thing you can do in such situations is to increase the number of links that point to your homepage or to the page that you want indexed. The number of incoming links and the PageRank of the domain directly affect both the total number of indexed pages of the website and the speed of indexation.

As a result, by adding links from high-traffic websites you can reduce the indexing time: the more links a page receives, the greater the probability that it gets indexed. So if you face indexing problems, make sure you add the link to your blog, post a thread in a relevant forum, or write press releases or articles that contain the link and submit them to several websites. Additionally, social media can be a handy tool in this situation, despite the fact that in most cases their links are nofollowed. Keep in mind that even though the major search engines claim they do not follow nofollowed links, experiments have shown that not only do they follow them, but they also index the linked pages faster (note that following them does not mean they pass any link juice).

2. Use XML and HTML sitemaps

Theoretically, search engines are able to extract the links of a page and follow them without your help. Nevertheless, it is highly recommended to use XML or HTML sitemaps, since they have proven to help the indexation process. After creating your XML sitemaps, make sure you submit them to the webmaster consoles of the various search engines and reference them in robots.txt. Also keep your sitemaps up to date and resubmit them when you make major changes to your website.
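
For reference, a minimal XML sitemap looks like the snippet below; the URL and date are placeholders, so adjust them to your own site. The single Sitemap: line in robots.txt is what tells crawlers where to find the file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2011-06-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
```

And in robots.txt:

```
User-agent: *
Sitemap: http://www.example.com/sitemap.xml
```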

3. Work on your Link Structure

As we saw in previous articles, link structure is extremely important for SEO because it can affect your rankings, the PageRank distribution and the indexation. Thus, if you face indexing problems, check your link structure and ensure that the non-indexed pages are linked properly from pages that are as close as possible to the root (homepage). Also make sure that your site does not have duplicate content problems, which could affect both the number of pages that get indexed and the average crawl period.

A good way to achieve faster indexation of a new page is to add a link to it directly from your homepage. Finally, if you want to increase the number of indexed pages, make sure your website has a tree-like link structure and that your important pages are no more than 3 clicks away from the homepage (the three-click rule).

4. Change the crawl rate

Another way to decrease the indexing time in Google is to change the crawl rate from the Google Webmaster Tools console. Setting the crawl rate to “faster” will allow Googlebot to crawl more pages, but unfortunately it will also increase the traffic generated on your server. Of course, since the maximum crawl rate that you can set is roughly 1 request every 3-4 seconds (specifically, 0.5 requests per second with a 2-second pause between requests), this should not cause serious problems for your server.

[Screenshot: the crawl rate setting in Google Webmaster Tools]

5. Use the available tools

The major search engines provide various tools that can help you manage your website. Bing provides the Bing Toolbox, Google offers Google Webmaster Tools and Yahoo offers the Yahoo Site Explorer. In all of these consoles you can manage the indexation settings of your website and your submitted sitemaps. Make sure you use all of them and regularly monitor your websites for warnings and errors. Also resubmit or ping the search engines’ sitemap services when you make a significant number of changes to your website. A tool that can help you speed up this pinging process is the Site Submitter; nevertheless, it is highly recommended that you also use the official tools of each search engine.

If you follow all the above tips and you still face indexing problems, then you should check whether your website is banned from the search engines, whether it is developed with search-engine-friendly techniques, whether you have enough domain authority to get that particular number of pages indexed, or whether you have made a serious SEO mistake (for example, blocking the search engines with robots.txt or meta robots tags). A good way to detect such mistakes is to use the Web SEO Analysis tool, which provides detailed diagnostics. Finally, most of the major search engines have dedicated groups and forums where you can seek help, so make sure you visit them and post your questions.

source: webseoanalytics.com

Search Engine Algorithm Basics

A good search engine does not attempt to return the pages that best match the input query. A good search engine tries to answer the underlying question. If you become aware of this, you’ll understand why Google (and other search engines) use a complex algorithm to determine which results they should return. The factors in the algorithm consist of “hard factors” such as the number of backlinks to a page and perhaps some social recommendations through likes and +1’s. These are usually external influences. You also have the factors on the page itself. For these, the way a page is built and various page elements play a role in the algorithm. But only by analyzing both the on-site and off-site factors is it possible for Google to determine which pages will answer the question behind the query. For this, Google has to analyze the text on a page.

In this article I will elaborate on the problems a search engine faces and some possible solutions. At the end of this article we won’t have revealed Google’s algorithm (unfortunately), but we’ll be one step closer to understanding some of the advice we often give as SEOs. There will be some formulas, but do not panic: this article isn’t just about those formulas. The article also comes with an Excel file. Oh, and the best thing: I will use some Dutch delights to illustrate the problems.

Croquets and Bitterballen
Behold: croquets are the elongated ones and bitterballen are the round ones 😉

True OR False
Search engines have evolved tremendously in recent years, but at first they could only deal with Boolean operators. In simple terms, a term was either included in a document or not. Something was true or false, 1 or 0. Additionally, you could use operators such as AND, OR and NOT to search for documents that contain multiple terms or to exclude terms. This sounds fairly simple, but it does have some problems. Suppose we have two documents, which consist of the following texts:

Doc1:
“And our restaurant in New York serves croquets and bitterballen.”

Doc2:
“In the Netherlands you retrieve croquets and frikandellen from the wall.”
Frikandellen
Oops, almost forgot to show you the frikandellen 😉

If we were to build a search engine, the first step is tokenization of the text. We want to be able to quickly determine which documents contain a term. This is easier if we put all the tokens in a database. A token is any single term in a text, so how many tokens does Doc1 contain?

The moment you started to answer this question for yourself, you probably thought about the definition of a “term”. Actually, in the example “New York” should be recognized as one term. How we can determine that the two individual words are actually one word is outside the scope of this article, so for the moment we treat each separate word as a separate token. So we have 10 tokens in Doc1 and 11 tokens in Doc2. To avoid duplicating information in our database, we will store types and not tokens.

Types are the unique tokens in a text. In the example, Doc1 contains the token “and” twice. In this example I ignore the fact that “and” appears once with and once without a capital. As with the determination of a term, there are techniques to determine whether something actually needs to be capitalized. In this case, we assume that we can store it without a capital and that “And” & “and” are the same type.

By storing all the types in the database along with the documents in which we can find them, we’re able to search within the database with the help of Booleans. The search “croquets” will return both Doc1 and Doc2. The search for “croquets AND bitterballen” will only return Doc1 as a result. The problem with this method is that you are likely to get too many or too few results. In addition, it lacks the ability to order the results. If we want to improve our method, we have to determine what we can use other than the presence / absence of a term in a document. Which on-page factors would you use to order the results if you were Google?
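
As an illustration only (this is a toy sketch in Python, not how any real engine is built), here is what tokenizing the two example documents, storing the types in an inverted index and answering the Boolean query “croquets AND bitterballen” with a set intersection could look like:

```python
import re

docs = {
    "Doc1": "And our restaurant in New York serves croquets and bitterballen.",
    "Doc2": "In the Netherlands you retrieve croquets and frikandellen from the wall.",
}

def tokenize(text):
    # Naive tokenizer: split on anything that is not a letter and lowercase,
    # so "And" and "and" end up as the same type.
    return [t.lower() for t in re.findall(r"[a-zA-Z]+", text)]

# Inverted index: type -> set of documents that contain it.
index = {}
for name, text in docs.items():
    for token in tokenize(text):
        index.setdefault(token, set()).add(name)

print(len(tokenize(docs["Doc1"])))                # 10 tokens in Doc1
print(index["croquets"])                          # {'Doc1', 'Doc2'}
print(index["croquets"] & index["bitterballen"])  # Boolean AND -> {'Doc1'}
```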

Zone Indexes
A relatively simple method is to use zone indexes. A web page can be divided into different zones: think of the title, description, author and body. By adding a weight to each zone in a document, we’re able to calculate a simple score for each document. This is one of the first on-page methods search engines used to determine the subject of a page. Scoring with zone indexes works as follows:

Suppose we add the following weights to each zone:

Zone        | Weight
title       | 0.4
description | 0.1
content     | 0.5

We perform the following search query:
“croquets AND bitterballen”

And we have a document with the following zones:

Zone        | Content                                                     | Boolean | Score
title       | New York Café                                               | 0       | 0
description | Café with delicious croquets and bitterballen               | 1       | 0.1
content     | Our restaurant in New York serves croquets and bitterballen | 1       | 0.5
Total       |                                                             |         | 0.6
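
A minimal Python sketch of this weighted zone scoring (per zone, the Boolean is 1 only if the zone contains every query term, and that zone then contributes its weight):

```python
def zone_score(zones, weights, query_terms):
    # A zone contributes its weight only if all query terms appear in it (Boolean per zone).
    score = 0.0
    for zone, text in zones.items():
        tokens = text.lower().split()
        if all(term in tokens for term in query_terms):
            score += weights[zone]
    return score

weights = {"title": 0.4, "description": 0.1, "content": 0.5}
doc = {
    "title": "New York Café",
    "description": "Café with delicious croquets and bitterballen",
    "content": "Our restaurant in New York serves croquets and bitterballen",
}
print(zone_score(doc, weights, ["croquets", "bitterballen"]))  # 0.1 + 0.5 = 0.6
```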

Because at some point everyone started abusing the weight assigned to, for example, the description, it became more important for Google to split the body into different zones and assign a different weight to each individual zone within the body.

This is quite difficult, because the web contains a variety of documents with different structures. Interpreting an XML document is quite simple for a machine. Interpreting an HTML document is harder: the structure and tags are much more limited, which makes the analysis more difficult. Of course there will be HTML5 in the near future and Google supports microformats, but these still have their limitations. For example, if you knew that Google assigns more weight to content within the <content> tag and less to content in the <footer> tag, you’d never use the <footer> tag.

To determine the context of a page, Google has to divide the web page into blocks. This way Google can judge which blocks on a page are important and which are not. One of the methods that can be used is the text / code ratio. A block on a page that contains much more text than HTML code probably contains the main content of the page. A block that contains many links / lots of HTML code and little content is probably the menu. This is why choosing the right WYSIWYG editor is very important: some of these editors produce a lot of unnecessary HTML code.

The use of text / code ratio is just one of the methods which a search engine can use to divide a page into blocks. Bill Slawski talked about identifying blocks earlier this year.

The advantage of the zone index method is that you can quite simply calculate a score for each document. A disadvantage, of course, is that many documents can end up with the same score.

Term frequency
When I asked you to think of on-page factors you would use to determine the relevance of a document, you probably thought about the frequency of the query terms. It is a logical step to give more weight to documents that use the search terms more often.

Some SEO agencies stick to the story of using keywords at a certain percentage of the text. We all know that isn’t how it works, but let me show you why. I’ll try to explain it on the basis of the following examples. Some formulas will come up, but as I said, it is the outline of the story that matters.

The numbers in the table below are the number of occurrences of a word in the document (also called the term frequency, or tf). So which document has a better score for the query “croquets and bitterballen”?

      | croquets | and | café | bitterballen | Amsterdam
Doc1  | 8        | 10  | 3    | 2            | 0
Doc2  | 1        | 20  | 3    | 9            | 2
DocN  | …        | …   | …    | …            | …
Query | 1        | 1   | 0    | 1            | 0

The score for both documents would be as follows:
score(“croquets and bitterballen”, Doc1) = 8 + 10 + 2 = 20
score(“croquets and bitterballen”, Doc2) = 1 + 20 + 9 = 30

In this case Document 2 is more closely related to the query. In this example the term “and” gains the most weight, but is this fair? It is a stop word, and we’d like to give it only a little value. We can achieve this by using the inverse document frequency (idf), which is the opposite of the document frequency (df). Document frequency is the number of documents in which a term occurs. Inverse document frequency is, well, the opposite: as the number of documents in which a term occurs grows, its idf shrinks.

You can calculate idf by dividing the total number of documents you have in your corpus by the number of documents containing the term and then taking the logarithm of that quotient.

Suppose that the idf values of our query terms are as follows:
Idf(croquets)            = 5
Idf(and)                   = 0.01
Idf(bitterballen)         = 2

Then you get the following scores:
score(“croquets and bitterballen”, Doc1) = 8*5  + 10*0.01 + 2*2 = 44.1
score(“croquets and bitterballen”, Doc2) = 1*5 + 20*0.01 + 9*2 = 23.2
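
A small Python sketch of this weighting (idf(t) = log(N / df(t)); the idf numbers below are simply the ones assumed in the example above):

```python
def tf_idf_score(query_terms, doc_tf, idf):
    # Sum of tf * idf over the query terms; no document length normalisation yet.
    return sum(doc_tf.get(term, 0) * idf[term] for term in query_terms)

idf = {"croquets": 5, "and": 0.01, "bitterballen": 2}   # assumed idf values from the example
doc1_tf = {"croquets": 8, "and": 10, "café": 3, "bitterballen": 2}
doc2_tf = {"croquets": 1, "and": 20, "café": 3, "bitterballen": 9, "Amsterdam": 2}

query = ["croquets", "and", "bitterballen"]
print(tf_idf_score(query, doc1_tf, idf))   # ~44.1
print(tf_idf_score(query, doc2_tf, idf))   # ~23.2
```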

Now Doc1 has a better score, but we still don’t take the length into account. One document can contain much more content than another document without being more relevant. A long document gains a higher score quite easily with this method.

Vector model
We can solve this by looking at the cosine similarity of a document. An exact explanation of the theory behind this method is outside the scope of this article, but you can think of it as a kind of harmonic mean between the query terms in the document. I made an Excel file so you can play with it yourself; there is an explanation in the file itself. You need the following metrics:

  • Query terms – each separate term in the query.
  • Document frequency – how many documents does Google know containing that term?
  • Term frequency – the frequency for each separate query term in the document (add this Focus Keyword widget made by Sander Tamaëla to your bookmarks, very helpful for this part)

Here’s an example where I actually used the model. The website had a page that was designed to rank for “fiets kopen” which is Dutch for “buying bikes”. The problem was that the wrong page (the homepage) was ranking for the query.

For the formula, we include the previously mentioned inverse document frequency (idf). For this we need the total number of documents in Google’s index; here we assume N = 10.4 billion.

An explanation of the table below:

  • tf = term frequency
  • df = document frequency
  • idf = inverse document frequency
  • Wt,q = weight for term in query
  • Wt,d = weight for term in document
  • Product = Wt,q * Wt,d
  • Score = Sum of the products

The main page, which was ranking: http://www.fietsentoko.nl/

term  | tf (query) | df          | idf         | Wt,q        | tf (doc) | Wf (tf²) | Wt,d    | Product
Fiets | 1          | 25,500,000  | 3.610493159 | 3.610493159 | 21       | 441      | 0.70711 | 2.55302
Kopen | 1          | 118,000,000 | 2.945151332 | 2.945151332 | 21       | 441      | 0.70711 | 2.08258
Score: 4.6356

The page I wanted to rank: http://www.fietsentoko.nl/fietsen/

term  | tf (query) | df          | idf         | Wt,q        | tf (doc) | Wf (tf²) | Wt,d    | Product
Fiets | 1          | 25,500,000  | 3.610493159 | 3.610493159 | 22       | 484      | 0.61782 | 2.23063
Kopen | 1          | 118,000,000 | 2.945151332 | 2.945151332 | 28       | 784      | 0.78631 | 2.31584
Score: 4.54647

Although the second document contains the query terms more often, its score for the query was lower (higher is better). This was because of the lack of balance between the query terms. Following this calculation, I changed the text on the page, increasing the use of the term “fietsen” and decreasing the use of “kopen”, which is a more generic term in the search engine and therefore carries less weight. This changed the score as follows:

term  | tf (query) | df          | idf         | Wt,q        | tf (doc) | Wf (tf²) | Wt,d    | Product
Fiets | 1          | 25,500,000  | 3.610493159 | 3.610493159 | 28       | 784      | 0.78631 | 2.83897
Kopen | 1          | 118,000,000 | 2.945151332 | 2.945151332 | 22       | 484      | 0.61782 | 1.81960
Score: 4.6586

After a few days, Google crawled the page, and the document I changed started to rank for the term. We can conclude that the number of times you use a term is not necessarily important; what matters is finding the right balance between the terms you want to rank for.
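
If you want to check the numbers without opening the Excel file, here is a rough Python sketch of the calculation used in the tables above: the query weight of a term (Wt,q) is its idf, the document weight (Wt,d) is the term frequency divided by the Euclidean length of the document’s tf vector, and the score is the sum of the products. The idf values are taken from the tables.

```python
import math

def vector_score(query_terms, doc_tf, idf):
    # Euclidean length of the document's tf vector: sqrt of the summed tf² (the "Wf" column).
    length = math.sqrt(sum(tf ** 2 for tf in doc_tf.values()))
    score = 0.0
    for term in query_terms:
        wt_q = idf[term]                                   # Wt,q: query weight = idf
        wt_d = doc_tf.get(term, 0) / length if length else 0.0
        score += wt_q * wt_d                               # product, summed into the score
    return score

idf = {"fiets": 3.610493159, "kopen": 2.945151332}         # idf values from the tables above
homepage_tf = {"fiets": 21, "kopen": 21}                   # http://www.fietsentoko.nl/
category_tf = {"fiets": 22, "kopen": 28}                   # http://www.fietsentoko.nl/fietsen/

print(vector_score(["fiets", "kopen"], homepage_tf, idf))  # ~4.64
print(vector_score(["fiets", "kopen"], category_tf, idf))  # ~4.55
```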

Speed up the process
Performing this calculation for each document that matches the search query costs a lot of processing power. You can fix this by adding some static values to determine for which documents you want to calculate the score. PageRank, for example, is a good static value. If you first calculate the score for the pages that match the query and have a high PageRank, you have a good chance of finding documents that would end up in the top 10 of the results anyway.

Another possibility is the use of champion lists. For each term, take only the top N documents with the best score for that term. For a multi-term query, you can then intersect those lists to find documents that contain all the query terms and probably have a high score. Only if there are too few documents containing all the terms do you search across all documents. So you’re not going to rank by only having the best vector score; you have to have your static scores right as well.
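
A rough Python sketch of that idea (the champion lists and the cut-off value here are made up purely for illustration):

```python
def candidate_documents(query_terms, champion_lists, full_postings, min_candidates=10):
    # First try the intersection of the (small) champion lists of every query term.
    candidates = set.intersection(*(set(champion_lists[t]) for t in query_terms))
    if len(candidates) < min_candidates:
        # Too few candidates: fall back to the full postings lists.
        candidates = set.intersection(*(set(full_postings[t]) for t in query_terms))
    return candidates

champion_lists = {"croquets": ["Doc1", "Doc7"], "bitterballen": ["Doc1", "Doc3"]}
full_postings = {"croquets": ["Doc1", "Doc2", "Doc7"], "bitterballen": ["Doc1", "Doc2", "Doc3"]}

print(candidate_documents(["croquets", "bitterballen"], champion_lists, full_postings, min_candidates=1))
# {'Doc1'}
```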

Relevance feedback
Relevance feedback is assigning more or less value to a term in a query, based on the relevance of a document. Using relevance feedback, a search engine can change the user query without telling the user.

The first step here is to determine whether a document is relevant or not. Although there are search engines where you can specify whether a result or a document is relevant, Google didn’t have such a function for a long time. Their first attempt was adding the favorite star to the search results. Now they are trying it with the +1 button. If enough people push the button for a certain result, Google will start considering the document relevant for that query.

Another method is to look at the current pages that rank well. These will be considered relevant. The danger of this method is topic drift. If you’re looking for bitterballen and croquettes, and the best ranking pages are all snack bars in Amsterdam, the danger is that you will assign value to Amsterdam and end up with just snack bars in Amsterdam in the results.

Another way for Google is to simply use data mining. They can also look at the CTR of different pages. Pages where the CTR is higher and the bounce rate is lower than average can be considered relevant. Pages with a very high bounce rate will simply be considered irrelevant.

An example of how we can use this data to adjust the query term weights is Rocchio’s feedback formula. It comes down to adjusting the value of each term in the query and possibly adding additional query terms. The formula is as follows:
qm = α·q0 + β·(1/|Dr|)·Σ dj over dj in Dr - γ·(1/|Dnr|)·Σ dj over dj in Dnr

In words: the new query vector is α times the original query vector, plus β times the average vector of the relevant documents, minus γ times the average vector of the irrelevant documents.

The table below is a visual representation of this formula. Suppose we apply the following values:
Query terms: +1 (alpha)
Relevant terms: +1 (beta)
Irrelevant terms: -0.5 (gamma)

We have the following query:
“croquets and bitterballen”

The relevance of the following documents is as follows:
Doc1   : relevant
Doc2   : relevant
Doc3   : not relevant

Terms        | Q | Doc1 | Doc2 | Doc3 | Weight in the new query
croquets     | 1 | 1    | 1    | 0    | 1 + 1 - 0 = 2
and          | 1 | 1    | 0    | 1    | 1 + 0.5 - 0.5 = 1
bitterballen | 1 | 0    | 0    | 0    | 1 + 0 - 0 = 1
café         | 0 | 0    | 1    | 0    | 0 + 0.5 - 0 = 0.5
Amsterdam    | 0 | 0    | 0    | 1    | 0 + 0 - 0.5 = -0.5 → 0

The new query is as follows:
croquets(2) and(1) bitterballen(1) cafe(0.5)

The value of each term is the weight it gets in the new query, and we can use those weights in our vector calculations. Although the term Amsterdam was given a score of -0.5, negative values are adjusted back to 0; this way we do not exclude terms from the search results. And although café did not appear in the original query, it was added and given a weight in the new query.
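
A small Python sketch of this calculation, reproducing the table above with binary term vectors (α = 1, β = 1, γ = 0.5, negative weights clipped back to 0):

```python
def rocchio(query_terms, relevant_docs, irrelevant_docs, alpha=1.0, beta=1.0, gamma=0.5):
    vocabulary = set(query_terms).union(*relevant_docs, *irrelevant_docs)
    new_weights = {}
    for term in vocabulary:
        in_query = 1.0 if term in query_terms else 0.0
        rel = sum(term in d for d in relevant_docs) / len(relevant_docs)
        irrel = sum(term in d for d in irrelevant_docs) / len(irrelevant_docs)
        weight = alpha * in_query + beta * rel - gamma * irrel
        new_weights[term] = max(weight, 0.0)   # adjust negative values back to 0
    return new_weights

query = ["croquets", "and", "bitterballen"]
doc1 = {"croquets", "and"}          # relevant
doc2 = {"croquets", "café"}         # relevant
doc3 = {"and", "Amsterdam"}         # not relevant

for term, weight in sorted(rocchio(query, [doc1, doc2], [doc3]).items()):
    print(term, weight)
# prints: Amsterdam 0.0, and 1.0, bitterballen 1.0, café 0.5, croquets 2.0
```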

If Google uses this kind of relevance feedback, you could look at the pages that already rank for a particular query. By using the same vocabulary, you can ensure that you get the most out of this form of relevance feedback.

Takeaways
In short, we’ve considered one of the options for assigning a value to a document based on the content of the page. Although the vector method is fairly accurate, it is certainly not the only method used to calculate relevance. There are many adjustments to the model, and it remains only one part of the complete algorithm of search engines like Google. We have taken a look at relevance feedback as well. *cough* Panda *cough*. I hope I’ve given you some insight into the methods search engines can use other than external factors. Now it’s time to discuss this and to go play with the Excel file 🙂

Have a good day!!!

source: http://www.seomoz.org

The 4 Critically Essential Off-The-Page Search Engine Optimization Factors

In our last lesson we talked about the things you can do on your website to help it rank well in the search engines — in other words, the “on the page” factors. In this lesson we’re going to talk about the external factors that can influence your rankings — the “off the page” factors.

Your Google PageRank

Before we get into the “hows”, it’s important that you understand a little bit about Google’s PageRank. PageRank is Google’s way of scoring all content and websites based on their importance in the internet community. It’s an important factor in Google’s ranking algorithm, and by understanding a little of how it works, you’ll have a better idea of how to boost your rankings in the world’s most popular search engine.

To establish the “importance” of your page, Google looks at how many other websites are linking to your page. These links are like “votes”, and the more “votes” you have, the greater your online “importance” and the higher your PageRank.

And higher PageRank is an important contributor to higher search engine rankings.

It’s not as democratic as it sounds, however: Not every page that links to you is given equal “voting power”. Pages that have a high PageRank have more voting power than pages with low PageRank. This means that the “boost” a link gives to your own PageRank is closely related to the PageRank of the site that’s linking to you.

For instance… receiving just ONE link from a PR5 page might well give you more benefit than receiving 20 links from PR0 pages. It’s quality not quantity that’s important.

The equation for working out how much PR value you’ll get from a link looks something like this:

  • PR = 0.15 + 0.85 x (your share of the linking page’s PR)
  • By “your share of the link PR” I mean that every page only has a certain amount of PR “juice” to give out. Let’s say a page has 100 votes, and let’s say it has 20 outgoing links on that page. Then each link is sending 5 votes to the other site (100 / 20 = 5). That is a simple way of looking at the share of the PR of the link; in reality the higher-placed links get more voting power (e.g. 10 votes each) while the lower-placed ones get less (e.g. 2 votes each). A quick sketch of this calculation follows this list.
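
Here is that calculation as a tiny Python sketch, using the simplified numbers from the bullet above. Real PageRank is computed iteratively over the whole link graph, so treat this purely as an illustration of the simplified formula:

```python
def pagerank_from_link(linking_page_pr, outgoing_links, damping=0.85, base=0.15):
    # Simplified model: each outgoing link passes an equal share of the linking page's PR.
    share = linking_page_pr / outgoing_links
    return base + damping * share

# A page with 100 "votes" and 20 outgoing links passes 100 / 20 = 5 votes per link.
print(pagerank_from_link(100, 20))   # 0.15 + 0.85 * 5 = 4.4
```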

There are many other factors at play that determine the PageRank of a page:

  1. The amount of PageRank flowing in to your page. PageRank can come from other sites linking to your page, but also from other pages on your website linking to your page.
  2. Your internal linking: As I just mentioned, PageRank can also come from other pages on your website, trickling from one page to another through your internal linking, menus and such. The trick is to “sculpt” the flow of your PageRank so that it “pools” in your most important pages. (In other words, don’t waste your PageRank by linking to your “contact us” page and site-map all over the show… add rel=”nofollow” to those links to stop the PageRank leaking through to them.)
  3. The number of pages in your website: the more pages your website has, the higher your PageRank will be.
  4. The number of external sites you link to. Again, think of PageRank as being something that “flows”. By linking to lots of other websites you’re letting your PageRank flow out of your page, rather than allowing it to pool. Try to have reciprocal links wherever possible, so that the PageRank flows back to you.

The best piece of advice is to keep these points in mind when building your site and try to avoid any on-page factors which might be detrimental to the “flow” of your PageRank through your site. Once you’ve done that, work on getting quality links from quality websites. The easiest way to do this is to fill your website with useful, relevant information that makes people want to link to you!

And remember: PageRank is just part of Google’s ranking algorithm. You’ll often see pages with high PageRank being outranked by pages with lower PageRank, which shows that there’s much more at play here!

#1: Build lots of 1-way incoming links

Do this through article submissions, directory submissions, submitting articles to blog networks (such as the PLRPro blog network), buying links (e.g. from digital point forums), and so on.

But be careful…

Purchased links can sometimes be more powerful than links you get by more natural methods… but Google will penalize you if they know that you are buying links. One way they’ll nab you is if you buy a link on a monthly lease and then end up canceling it. One link might not be enough to send up the red flags, but some people buy and cancel hundreds of links in this manner.

A better idea is to buy lifetime links from places like forums.digitalpoint.com, and to try to find links from websites that are on topics relevant to your own.

#2: Get some links from good sites

By “good sites” I mean websites that have a high PageRank, or sites with a high “trust” factor (such as Yahoo, Dmoz or sites with a .edu suffix). If you can get good links to the pages on your site that generate the most income for you, even better — if you can improve the ranking of these pages you’ll get more traffic, more conversions, and more money!

#3: Make sure that pages you gain links from are, in fact, indexed.

A link to your site won’t count for anything if the page that is linking to you hasn’t actually been indexed by the search engines. The search engines won’t see the link, and they won’t give you any credit for it. I see a lot of people submitting their sites to article directories and search directories, and then ending up on a page that the search engines don’t visit. This is pointless!

The good news is that it’s pretty simple to get all these pages indexed. All you have to do is let the search engines know about the page yourself. To do this you need to set up a webpage outside of your main site, such as a free blog or a Twitter.com profile. Make sure that the search engines are indexing this page, of course, and then every time you get a new link to your main site, write about it in your blog or Twitter profile! The search engines will see this and visit the other site — hey presto! The page is now indexed, and you’ll get credit for your link.

Important: Don’t link to this blog or Twitter profile from your main money website. Doing this will create a reciprocal link loop…

#4: Don’t loop your links

Reciprocal links aren’t as powerful as one-way links. This is why you want to receive one-way links from other websites wherever possible.

But there are also things called “reciprocal link loops” which are like bigger versions of this. I mentioned one in the last tip… A links to B, B links to C and C links to A. That’s a loop… it eventually comes full circle back to the first site. A “link loop” can get pretty large, but if it eventually ends up back at the start, it’s still a loop, and all links within the loop become less powerful. Small loops are the worst, but try to avoid loops wherever possible.

That brings us to the end of our critical off-page factors for search engine optimization. In part three of this five-part mini-course I’ll talk about link building strategies: keep an eye out for it!

How Google’s Panda Update Changed SEO Best Practices Forever

It’s here! Google has released Panda update 2.2, just as Matt Cutts said they would at SMX Advanced here in Seattle a couple of weeks ago. This time around, Google has – among other things – improved their ability to detect scraper sites and banish them from the SERPs. Of course, the Panda updates are changes to Google’s algorithm and are not merely manual reviews of sites in the index, so there is room for error (causing devastation for many legitimate webmasters and SEOs).

A lot of people ask what parts of their existing SEO practice they can modify and emphasize to recover from the blow, but alas, it’s not that simple. In this week’s Whiteboard Friday, Rand discusses how the Panda updates work and, more importantly, how Panda has fundamentally changed the best practices for SEO. Have you been Panda-abused? Do you have any tips for recuperating? Let us know in the comments!

Panda, also known as Farmer, was this update that Google came out with in March of this year, of 2011, that rejiggered a bunch of search results and pushed a lot of websites down in the rankings, pushed some websites up in the rankings, and people have been concerned about it ever since. It has actually had several updates and new versions of that implementation and algorithm come out. A lot of people have all these questions like, “Ah, what’s going on around Panda?” There have been some great blog posts on SEOmoz talking about some of the technical aspects. But I want to discuss in this Whiteboard Friday some of the philosophical and theoretical aspects and how Google Panda really changes the way a lot of us need to approach SEO.

So let’s start with a little bit of Panda history. Google employs an engineer named Navneet Panda. The guy has done some awesome work. In fact, he was part of a patent application that Bill Slawski looked into where he found a great way to scale some machine learning algorithms. Now, machine learning algorithms, as you might be aware, are very computationally expensive and they take a long time to run, particularly if you have extremely large data sets, both of inputs and of outputs. If you want, you can research machine learning. It is an interesting fun tactic that computer scientists use and programmers use to find solutions to problems. But basically before Panda, machine learning scalability at Google was at level X, and after it was at the much higher level Y. So that was quite nice. Thanks to Navneet, right now they can scale up this machine learning.

What Google can do based on that is take a bunch of sites that people like more and a bunch of sites that people like less, and when I say like, what I mean is essentially what the quality raters, Google’s quality raters, tell them this site is very enjoyable. This is a good site. I’d like to see this high in the search results. Versus things where the quality raters say, “I don’t like to see this.” Google can say, “Hey, you know what? We can take the intelligence of this quality rating panel and scale it using this machine learning process.”

Here’s how it works. Basically, the idea is that the quality raters tell Googlers what they like. They answer all these questions, and you can see Amit Singhal and Matt Cutts were interviewed by Wired Magazine. They talked about some of the things that were asked of these quality raters, like, “Would you trust this site with your credit card? Would you trust the medical information that this site gives you with your children? Do you think the design of this site is good?” All sorts of questions around the site’s trustworthiness, credibility, quality, how much they would like to see it in the search results. Then they compare the difference.

The sites that people like more, they put in one group. The sites that people like less, they put in another group. Then they look at tons of metrics. All these different metrics, numbers, signals, all sorts of search signals that many SEOs suspect come from user and usage data metrics, which Google has not historically used as heavily. But they think that they use those in a machine learning process to essentially separate the wheat from the chaff. Find the ones that people like more and the ones that people like less. Downgrade the ones they like less. Upgrade the ones they like more. Bingo, you have the Panda update.

So, Panda kind of means something new and different for SEO. As SEOs, for a long time you’ve been doing the same kind of classic things. You’ve been building good content, making it accessible to search engines, doing good keyword research, putting those keywords in there, and then trying to get some links to it. But you have not, as SEOs, we never really had to think as much or as broadly about, “What is the experience of this website? Is it creating a brand that people are going to love and share and reward and trust?” Now we kind of have to think about that.

It is almost like the job of SEO has been upgraded from SEO to web strategist. Virtually everything you do on the Internet with your website can impact SEO today. That is especially true following Panda. The things that they are measuring is not, oh, these sites have better links than these sites. Some of these sites, in fact, have much better links than these sites. Some of these sites have what you and I might regard, as SEOs, as better content, more unique, robust, quality content, and yet, people, quality raters in particular, like them less or the things, the signals that predict that quality raters like those sites less are present in those types of sites.

Let’s talk about a few of the specific things that we can be doing as SEOs to help with this new sort of SEO, this broader web content/web strategy portion of SEO.

First off, design and user experience. I know, good SEOs have been preaching design and user experience for years because it tends to generate more links, people contribute more content to it, it gets more social signal shares and tweets and all this other sort of good second order effect. Now, it has a first order effect impact, a primary impact. If you can make your design absolutely beautiful, versus something like this where content is buffeted by advertising and you have to click next, next, next a lot. The content isn’t all in one page. You cannot view it in that single page format. Boy, the content blocks themselves aren’t that fun to read, even if it is not advertising that’s surrounding them, even if it is just internal messaging or the graphics don’t look very good. The site design feels like it was way back in the 1990s. All that stuff will impact the ability of this page, this site to perform. And don’t forget, Google has actually said publicly that even if you have a great site, if you have a bunch of pages that are low quality on that site, they can drag down the rankings of the rest of the site. So you should try and block those for us or take them down. Wow. Crazy, right? That’s what a machine learning algorithm, like Panda, will do. It will predictively say, “Hey, you know what? We’re seeing these features here, these elements, push this guy down.”

Content quality matters a lot. So a lot of time, in the SEO world, people will say, “Well, you have to have good, unique, useful content.” Not enough. Sorry. It’s just not enough. There are too many people making too much amazing stuff on the Internet for good and unique and grammatically correct and spelled properly and describes the topic adequately to be enough when it comes to content. If you say, “Oh, I have 50,000 pages about 50,000 different motorcycle parts and I am just going to go to Mechanical Turk or I am going to go outsource, and I want a 100 word, two paragraphs about each one of them, just describe what this part is.” You think to yourself, “Hey, I have good unique content.” No, you have content that is going to be penalized by Panda. That is exactly what Panda is designed to do. It is designed to say this is content that someone wrote for SEO purposes just to have good unique content on the page, not content that makes everyone who sees it want to share it and say wow. Right?

If I get to a page about a motorcycle part and I am like, “God, not only is this well written, it’s kind of funny. It’s humorous. It includes some anecdotes. It’s got some history of this part. It has great photos. Man, I don’t care at all about motorcycle parts, and yet, this is just a darn good page. What a great page. If I were interested, I’d be tweeting about this, I’d share it. I’d send it to my uncle who buys motorcycles. I would love this page.” That’s what you have to optimize for. It is a totally different thing than optimizing for did I use the keyword at least three times? Did I put it in the title tag? Is it included in there? Is the rest of the content relevant to the keywords? Panda changes this. Changes it quite a bit. 😉

Finally, you are going to be optimizing around user and usage metrics. Things like, when people come to your site, generally speaking compared to other sites in your niche or ranking for your keywords, do they spend a good amount of time on your site, or do they go away immediately? Do they spend a good amount of time? Are they bouncing or are they browsing? If you have a good browse rate, people are browsing 2, 3, 4 pages on average on a content site, that’s decent. That’s pretty good. If they’re browsing 1.5 pages on some sites, like maybe specific kinds of news sites, that might actually be pretty good. That might be better than average. But if they are browsing like 1.001 pages, like virtually no one clicks on a second page, that might be weird. That might hurt you. Your click-through rate from the search results. When people see your title and your snippet and your domain name, and they go, “Ew, I don’t know if I want to get myself involved in that. They’ve got like three hyphens in their domain name, and it looks totally spammy. I’m not going to get involved.” Then that click-through rate is probably going to suffer and so are your rankings.

They are going to be looking at things like the diversity and quantity of traffic that comes to your site. Do lots of people from all around the world or all around your local region, your country, visit your website directly? They can measure this through Chrome. They can measure it through Android. They can measure it through the Google toolbar. They have all these user and usage metrics. They know where people are going on the Internet, where they spend time, how much time they spend, and what they do on those pages. They know about what happens from the search results too. Do people click from a result and then go right back to the search results and perform another search? Clearly, they were unhappy with that. They can take all these metrics and put them into the machine learning algorithm and then have Panda essentially recalculate. This is why you see that Google doesn’t issue updates every day or every week. It is about every 30 or 40 days that a new Panda update will come out because they are rejiggering all this stuff. 🙂

One of the things that people who get hit by Panda come up to me and say, “God, how are we ever going to get out of Panda? We’ve made all these changes. We haven’t gotten out yet.” I’m like, “Well, first off, you’re not going to get out of it until they rejigger the results, and then there is no way that you are going to get out of it unless you change the metrics around your site.” So if you go into your Analytics and you see that people are not spending longer on your pages, they are not enjoying them more, they are not sharing them more, they are not naturally linking to them more, your branded search traffic is not up, your direct type in traffic is not up, you see that none of these metrics are going up and yet you think you have somehow fixed the problems that Panda tries to solve for, you probably haven’t.

I know this is frustrating. I know it’s a tough issue. In fact, I think that there are sites that have been really unfairly hit. That sucks and they shouldn’t be and Google needs to work on this. But I also know that I don’t think Google is going to be making many changes. I think they are very happy with the way that Panda has gone from a search quality perspective and from a user happiness perspective. Their searchers are happier, and they are not seeing as much junk in the results. Google likes the way this is going. I think we are going to see more and more of this over time. It could even get more aggressive. I would urge you to work on this stuff, to optimize around these things, and to be ready for this new form of SEO. 🙂

Google Panda 3.1 Update : 11/18

Friday afternoon, sometime after 4pm I believe, Google tweeted that they pushed out a “minor” Panda update affecting less than one percent of all searches.

The last time Google said a Panda update was minor, it turned out to be pretty significant.

That being said, we should have named it 3.0 – in fact, I spoke to someone at Google who felt the same. So I am going to name this one 3.1, although it may be easier to simply reference these updates by date.


For more on Panda, see our Google Panda category.

Forum discussion at WebmasterWorld.

Best SEO Blogs: Top 10 Sources to Stay Up-to-Date

Like many overly-connected web junkies, I find myself increasingly overwhelmed by information, resources and news. Sorting the signal from the noise is essential to staying sane, but missing an important development can be costly. To balance this conflict, I’ve recently re-arranged my daily reading habits (which I’ve written about several times before) and my Firefox sidebar (a critical feature that keeps me from switching to Chrome).

I’ll start by sharing my top 10 sources in the field of search & SEO, then give you a full link list for those interested in seeing all the resources I use. I’ve whittled the list down to just ten to help maximize value while minimizing time expended (in my less busy days, I’d read 4-5 dozen blogs daily and even more than that each week).

Top 10 Search / SEO Blogs

#1 – Search Engine Land


  • Why I Read It: For several years now, SELand has been the fastest, most accurate and well-written news source in the world of search. The news pieces in particular provide deep, useful, interesting coverage of their subjects, and though some of the columns on tactics/strategies are not as high quality, a few are still worth a read. Overall, SELand is the best place to keep up with the overall search/technology industry, and that’s important to anyone operating a business in the field.
  • Focus: Search industry and search engine news
  • Update Frequency: Multiple times daily

#2 – SEOmoz


  • Why I Read It: Obviously, it’s hard not to be biased, but removing the personal interest, the SEOmoz Blog is still my favorite source for tactical & strategic advice, as well as “how-to” content. I’m personally responsible for 1 out of every 4-6 articles, but the other 75%+ almost always give me insight into something new. The comments are also, IMO, often as good or better than the posts – the moz community attracts a lot of talented, open, sharing professionals and that keeps me reading daily.
  • Focus: SEO & web marketing tactics & strategies
  • Update Frequency: 1-2 posts per weekday

#3 – SEOBook


  • Why I Read It: The SEOBook blog occasionally offers some highly useful advice or new tactics, but recently most of the commentary has focused on the shifting trends in the SEO industry, along with a healthy dose of engine- and establishment-critical editorials. These are often quite instructive on their own, and I think more than a few have had a substantive impact on changing the direction of players big and small.
  • Focus: Industry trends as they relate to SEO; editorials on abuse & manipulation
  • Update Frequency: 1-3X per week

#4 – Search Engine Roundtable


  • Why I Read It: Barry Schwartz has long maintained this bastion of recaps, roundups and highlights from search-related discussions and forums across the web. The topics are varied, but usually useful and interesting enough to warrant at least a daily browse or two.
  • Focus: Roundup of forum topics, industry news, SEO discussions
  • Update Frequency: 3-4X Daily

#5 – Search Engine Journal


  • Why I Read It: The Journal strikes a nice balance between tactical/strategic articles and industry coverage, and anything SELand misses is often here quite quickly. They also do some nice roundups of tools and resources, which I find useful from an analysis & competitive research perspective.
  • Focus: Industry news, tactics, tools & resources
  • Update Frequency: 2-3X Daily

#6 – Conversation Marketing


  • Why I Read It: I think Ian Lurie might be the fastest-rising blog on my list. His blog has gone from occasionally interesting to nearly indispensable over the last 18 months, as the quality of content, focus on smart web/SEO strategies and witty humor shine through. As far as advice/strategy blogs in the web marketing field go, his is one of my favorites for consistently great quality.
  • Focus: Strategic advice, how-to articles and the occasional humorous rant
  • Update Frequency: 2-4X weekly

#7 – SEO By the Sea


  • Why I Read It: Bill Slawski takes a unique approach to the SEO field, covering patent applications, IR papers, algorithmic search technology and other technically interesting and often useful topics. There’s probably no better analysis source out there for this niche, and Bill’s work will often inspire posts here on SEOmoz (e.g. 17 Ways Search Engines Judge the Value of a Link).
  • Focus: IR papers, patents and search technology
  • Update Frequency: 1-3X per week

#8 – Blogstorm


  • Why I Read It: Although Blogstorm doesn’t update as frequently as some of the others, nearly every post is excellent. In the last 6 months, I’ve been seriously impressed by the uniqueness of the material covered and the insight shown by the writers (mostly Patrick Altoft with occasional other contributors). One of my favorites, for example, was their update to some of the AOL CTR data, which I didn’t see well covered elsewhere.
  • Focus: SEO insider analysis, strategies and research coverage
  • Update Frequency: 3-5X monthly

#9 – Dave Naylor


  • Why I Read It: Dave’s depth of knowledge is legendary and, unlike many successful business owners in the field, he’s personally kept himself deeply aware of and involved in SEO campaigns. This acute attention to the goings-on of the search rankings has made his articles priceless (even if the grammar/spelling isn’t always stellar). The staff, who write 50%+ of the content these days, are also impressively knowledgeable and maintain a good level of discourse and disclosure.
  • Focus: Organic search rankings analysis and macro-industry trends
  • Update Frequency: 1-3X weekly

#10 – Marketing Pilgrim


  • Why I Read It: A good mix of writers cover the search industry news and some tactical/strategic subjects as well. The writing style is compelling and it’s great to get an alternative perspective. I’ve also noticed that MP will sometimes find a news item that other sites miss and I really appreciate the feeling of comprehensiveness that comes from following them + SELand & SERoundtable.
  • Focus: Industry news, tactical advice and a bit of reputation/social management
  • Update Frequency: 2-3X daily

Other sites that I’ll read regularly (who only barely missed my top 10) include Distilled, YOUmoz, Performable, Chris Brogan, the Webmaster Central Blog, Eric Enge, Avinash Kaushik, SEWatch, Gil Reich & the eMarketer blog. I also highly recommend skimming through SEO Alltop, as it lets me quickly review anything from the longer tail of SEO sites.


The rest of my Firefox sidebar is listed below, sorted into sections according to the folders I use. Note that because I’ve got the SEOmoz toolbar (mozBar), I use that to access all the moz SEO tools rather than replicating them in my sidebar. I’ve also been able to ditch my large collection of bookmarklets thanks to the mozBar, but if you prefer to keep them, here’s a great set of 30 SEO Bookmarklets (all draggable).

UPDATE: TopRank just published a list of the most subscribed-to blogs in the SEO field that can also be a great resource for those interested. 🙂

source: http://www.seomoz.org

How to Establish QUALITY Backlinks?

When starting your own website or blog, achieving a good position in the search engine results pages (SERPs) is usually one of the main goals. A high ranking for relevant keywords is essential for the exposure you need to drive traffic and visitors to your content. The sheer number of online competitors can be enough to overwhelm even the most determined webmaster. However, if you plan things out properly, you can overcome even stiff competition with a bit of effort. You can even do this with fewer posts and a much younger site.

The way to reach the top spots in searches is through properly targeted quality content and backlinks. You target your posts and backlinks based on your niche’s relevant keywords. “Relevant keywords” is a short way of saying the words and phrases that people will naturally type into search engines when they are looking for the kind of content, products, and/or services that you offer.

Quality content is an important key. Your posts, articles, videos, and other content must appeal to human readers. It is easy to adjust well-written features to include relevant keywords for the search engines to see. However, you will only drive away human readers if you make poorly written blogs, spammy sales text, and other “junk” content. It is important to make sure things are tuned properly for search engines, but do not forget it is human beings that open their minds and wallets.

Building good backlinks to quality content is one of the most effective and surefire ways to drive your site to the top of the SERPs. Good backlinks use your relevant keywords as the anchor text. Anchor text is simply a web link that is wrapped around a word or set of words, just like a website menu link. 🙂
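
For example, a keyword-rich backlink is just an ordinary HTML link where the clickable text is the keyword phrase (the URL and keyword here are placeholders):

```html
<a href="http://www.example.com/">relevant keyword phrase</a>
```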

Knowing where to build your backlinks is a key element. You should place your links naturally in places that have many visitors and/or high search engine trust. Trust is usually measured by PageRank, Google’s measure of website authority, while Alexa Rank is the usual tool for measuring visitor volume.

Blog backlinks are one of the most common and powerful tools. Bloggers form a large network and community. You can include your website and one or two backlinks with a blog comment. Auto-approve blogs are considered highly desirable, because comments are not screened and do not wait on a human being to approve them. Dofollow relevant blogs and forums are also greatly sought after because backlinks from them count the most, as the search engines see the referring site as explicitly endorsing the links.

When leaving blog backlinks, be sure to include substantive comments. Just spamming your links everywhere will get you targeted by spam detection programs and leave a wake of annoyed blog owners. When using auto-approve blogs, be sure to check the other links present. You do not want your site featured in a “bad neighborhood” with male enhancement and gambling spammers. Auto-approve blogs make for easy and good links, as long as the comment sections are not overflowing and the other links look good.

It can seem impossible when first starting out to reach the top of the search engine rankings. However, you can dominate the search results with quality content and good backlinks. Blog backlinks are among the most popular options and auto-approve blogs provide a major venue to promote your site.

Cheers !!! 🙂

Easy Ways To Optimize Your Site For SEO- Quick Overview

Are you losing your mind and pulling your hair out trying to crack the code of search engine optimization? Look, I have been there so let me enlighten you on a few easy techniques I have picked up along the way to help you get more traffic and make more sales through your websites and web 2.0 properties.

First off, there is no magic formula for SEO, no matter what any guru says, nor is it as complicated a process as so-called SEO experts make it out to be. Basically, it comes down to common sense about how things work on the technical side of the web.

First up is keywords. If you are driving traffic to your websites via keywords, you ought to know how to utilize them to the max for the best-case scenario. Using the keyword in your domain, description, title, and tags is very important, but do not overuse it in the body text. Once in bold is enough; have your main keyword supported by other keywords in your text, but most importantly write naturally, using the slang and proper language associated with your topic. There are SEO tools to help point this out if you are using a blogging platform, in case your eyes miss it. 🙂
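
As a quick illustration, the title, description, and tags mentioned above live in the page’s HTML head, something like the snippet below (the keywords and wording are placeholders to adapt to your own topic):

```html
<head>
  <title>Main Keyword - Short, Natural Page Title</title>
  <meta name="description" content="A natural sentence or two that includes the main keyword.">
  <meta name="keywords" content="main keyword, supporting keyword, another related phrase">
</head>
```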

Next is competition. Can you compete against a giant keyword term? You could, but it will take a very long time to catch up, so it is best to stick with terms you can compete for. Usually, staying under 100,000 competing pages is key to ranking on the first page. Look at your competition and take note of their domain age, PageRank, and backlinks. This will determine what it will take, and how long it will take, to outrank them.

And then there are backlinks. This is a very important step in ensuring you rank in the top 10, or even #1! It is not how many backlinks you have but where they come from. An example: 25 links from PR5 to PR9 sites will outrank 10,000 links from junk, low-PageRank sites, period. I should also point out that these links should be relevant to your niche. If you have a site about “SEO”, then most of your incoming links should be from SEO-type sites and not “basket weaving”, for instance. This way you can easily become untouchable in the rankings.

Good luck!!! 🙂