Jordan Scrapes Amazon For Private Label Products (work in progress)

So. I have some code ready here but it’s not actually “ready”. This post is going to be some discussion about my goals and some of the problems I’m facing with reaching those goals. Again, the code, as of this date, is not ready. It works but it doesn’t have a readme.md.

Once upon a time I wanted to sell private label products. I spent hours and hours and hours trying to find them. I did eventually find one and I had a lot of success with it for probably two years before eventually I had enough quality issues that I decided to stop selling it. I’ve thought a lot about trying to find another product but I honestly dreaded the search again. I didn’t have a lot of faith that I could find one.

Real anxiety about finding another private label product.

Private label is more about the results page than the product

As I’ve thought about this more, I’ve realized that what I really want this scraper to do is return a bunch of results pages, so you know what you are competing against. With a private label product, you ideally want to be able to reach that front page. You want to know that there is volume there, that the competition isn’t too much, and that the product has a good price.

Results page.

No competition with more than 500 reviews

If the product you are looking at has direct competition with more than 500 reviews, I think there had better be some serious differentiators to justify pursuing it. This is probably the biggest deal. You want there to be enough volume for you to enter the market, but you don’t want it to be dominated by other people.
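Here is a minimal sketch of that 500-review check in Python. The `Listing` shape and the `is_too_competitive` name are assumptions made up for illustration; they are not part of the scraper and not anything Amazon provides.

```python
from dataclasses import dataclass

REVIEW_LIMIT = 500  # the threshold mentioned in this post


@dataclass
class Listing:
    """One product on a results page (hypothetical shape)."""
    title: str
    review_count: int


def is_too_competitive(listings, limit=REVIEW_LIMIT):
    """Return True if any listing has more than `limit` reviews."""
    return any(item.review_count > limit for item in listings)
```

A results page would only count as a lead when `is_too_competitive` returns `False` for its listings. Volume and price would need their own checks.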

So how do I get here? Well, what I would do when I was looking for products was start at the Amazon best sellers, click on a category, and look for something exciting to sell. The main problem with starting there is that I know all of these products are probably going to be too competitive for me to sell, so I’d start to head down the rabbit hole.

The rabbit hole begins.

I then go to the “Selected Items” section. I’ll check some of those, and generally one level deep is still too competitive, so I’ll check the “Selected Items” (or any other recommended items) on those products and keep going until I get inspired. It’s normally a couple of levels deep from the best sellers.

Once I find a product, though, I need to see what the total sales volume of this product is and how the competition is looking. So I’ll search for it. In the case of the image above, I’d probably search Amazon for whatever keyword I think my customers would search for, “mattress cover” or something like that. That results page is what I want this scraper to find.

Problems

I built the scraper to mimic exactly what I do. It starts at the best sellers (I’ve hand-selected a few) and gets the URLs for those products’ details pages. Then it goes to those details pages and starts finding URLs from the “Selected Items” sections.
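The walk from best sellers down through the recommended items can be sketched as a depth-limited, breadth-first crawl. This is only an illustration in Python, not the actual scraper code. `get_related_urls` is a hypothetical callable that, given a details page URL, returns the URLs found in its recommendation sections, and the `max_depth=2` default is an assumption based on "a couple of levels deep".

```python
from collections import deque


def collect_detail_urls(start_urls, get_related_urls, max_depth=2):
    """Breadth-first walk from the starting details pages.

    Returns a list of (url, depth) pairs in the order they were visited.
    """
    start = list(dict.fromkeys(start_urls))  # drop duplicates, keep order
    seen = set(start)
    queue = deque((url, 0) for url in start)
    found = []
    while queue:
        url, depth = queue.popleft()
        found.append((url, depth))
        if depth >= max_depth:
            continue  # do not follow recommendations past the depth limit
        for related in get_related_urls(url):
            if related not in seen:
                seen.add(related)
                queue.append((related, depth + 1))
    return found
```

Keeping a `seen` set means a product that shows up in several recommendation sections is only visited once.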

Now for the hard part. How do I get a results page from a details page? How does my scraper know what my customer would search for? I don’t know yet. What I’m doing right now is taking the product title, removing the brand name from it and searching that. It’s not working great so far. I end up with things like this:

Results pages look like this.

That’s a very specific results page that doesn’t really help me decide anything. I still need a way to filter it down better.
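For reference, the "remove the brand name from the title" step could look something like this in Python. It is a sketch under assumptions: `strip_brand` is a made-up name, and it assumes the brand is already known as a separate string (for example, scraped from the details page).

```python
import re


def strip_brand(title, brand):
    """Remove the brand name from a product title, ignoring case."""
    if not brand.strip():
        return " ".join(title.split())
    # Only match the brand as a whole word, not inside another word.
    pattern = re.compile(
        r"(?<!\w)" + re.escape(brand.strip()) + r"(?!\w)", re.IGNORECASE
    )
    without_brand = pattern.sub(" ", title)
    # Collapse the whitespace left behind by the removal.
    return " ".join(without_brand.split())
```

The result is still the full product title minus the brand, which is why the search comes back so specific.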

Possible solutions

What if I just truncate the search term? Like only use the first 20 characters. It might remove some specificity but I might also end up with some real crap pages that don’t represent the real original product at all.
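A rough Python sketch of the truncation idea follows. One assumption beyond the text: instead of cutting at exactly 20 characters, it keeps only the whole words that fit, so the search term never ends mid-word. `truncate_term` is a hypothetical name.

```python
def truncate_term(title, max_chars=20):
    """Keep the leading whole words of a title that fit in max_chars."""
    words = []
    length = 0
    for word in title.split():
        # +1 accounts for the space before every word but the first.
        extra = len(word) + (1 if words else 0)
        if length + extra > max_chars:
            break
        words.append(word)
        length += extra
    # If even the first word is too long, fall back to a hard cut.
    return " ".join(words) or title.strip()[:max_chars]
```

With this, "Portable Folding Easy-Clean Camping Table" becomes "Portable Folding", which shows the risk: the term gets broader, but it can lose the word that says what the product is.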

I’ve thought about looking for prepositions (with, at, on, over) and removing everything after (and including) the preposition. The above image is from “40” Portable Folding Easy-Clean Camping Table with Faucet and Dual Water Basins”. So with this method I’d be left with “40” Portable Folding Easy-Clean Camping Table”. That gives me something like this:

Removing after prepositions.

It looks better. It may be an okay solution. I think it’s important to keep in mind that I’m not looking for perfection here. Just some solid leads. Something to help inspire some good product ideas.
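The preposition idea can be sketched in a few lines of Python. The word list is just the four prepositions named above, and `cut_at_preposition` is a hypothetical name, so treat this as an illustration rather than the scraper's actual code.

```python
PREPOSITIONS = {"with", "at", "on", "over"}  # the four named in this post


def cut_at_preposition(title, prepositions=PREPOSITIONS):
    """Drop the first preposition in a title and everything after it."""
    words = title.split()
    for index, word in enumerate(words):
        # Skip index 0 so a title that starts with a preposition
        # is not reduced to an empty search term.
        if index > 0 and word.lower() in prepositions:
            return " ".join(words[:index])
    return " ".join(words)
```

Running it on the camping table title from above cuts at "with" and keeps everything before it. A title with no preposition is returned unchanged apart from whitespace cleanup.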

Anyway, I’m going to keep tinkering with it this week. I hope to have something ready for next week. I’m still having some memory leakage with my script and it keeps running out of memory on my droplet. I’m sure it’s a problem with my script and not the droplet.

DigitalOcean is really awesome for this sort of thing and I’m still really loving using it.
