Jordan Puts the Dead Link Checker in an Electron App

Sample code here

App download here

Making it non-coder friendly

The goal of this project was to make the link checker something that anyone could easily use, and Electron is a really easy way to do that.

Electron has gotten popular enough that at this point you can search for almost any boilerplate you want and it will be available. I'm very comfortable with Angular, so I ended up using Maxime Gris' package, and it was super well set up.

Code changes

I really didn't have to make many code changes to get this running. The main one was copying index.ts into an index-local.ts file that the link checker script runs with npm start.

I turned index.ts into a function that accepts the parameters, like export async function deadLinkChecker(desiredDomain: string, desiredIOThreads: any, links: any[]) {. That way I can import the deadLinkChecker function into my Electron app after it's installed as an npm dependency.
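A rough sketch of what that split looks like; the bodies and arguments here are placeholders, and only the exported signature comes from the real code:

// index.ts – exported so the Electron app can pull it in as an npm dependency.
export async function deadLinkChecker(
  desiredDomain: string,
  desiredIOThreads: any,
  links: any[]
): Promise<void> {
  // ...the existing crawl / request logic lives here...
}

// index-local.ts – a thin wrapper so `npm start` still runs the checker from the CLI.
// The arguments below are illustrative, not the project's real defaults.
import { deadLinkChecker } from './index';

deadLinkChecker('https://example.com', 4, []);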

Honestly, besides that, the rest of the code changes are tied up in the not-insignificant number of problems listed below.

Problems; I got ’em

I would say at this point this project isn’t finished yet. Over the next week I’m going to work on ironing out the few (but pretty big) problems that still exist.

maxSockets ignored in Electron?

response.request.agent

The first (and probably biggest) problem is that the maxSockets number I set appears to be ignored when running from Electron. Although I've logged the request agent and it lists the correct number in maxSockets, the completion time doesn't match what I would expect.

For example, with maxSockets set to 1, the run takes ~35 seconds to complete when I run the script manually. Run through Electron, it takes between 17 and 21 seconds, which is the same time it takes with 4 maxSockets, whether manually or from Electron. In Electron, the job takes the same amount of time regardless of the maxSockets value I set.

This is a problem because if maxSockets is being ignored, the checker is just hitting the site as fast as it can, which could effectively act like a DDoS attack. This app is not meant to be a weapon.
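To rule out the shared globalAgent as the culprit, one thing worth checking is passing an explicit agent on every request and logging it, roughly like this (a simplified illustration, not the project's exact code):

import * as https from 'https';

// Hedged sketch: an explicit agent, so maxSockets is not inherited from a shared globalAgent.
const agent = new https.Agent({ keepAlive: true, maxSockets: 1 });

function checkLink(url: string): Promise<number> {
  return new Promise((resolve) => {
    const req = https.get(url, { agent }, (res) => {
      // Same idea as logging response.request.agent: see what the agent really has.
      console.log('maxSockets in use:', agent.maxSockets);
      res.resume(); // drain the body so the socket is released back to the pool
      resolve(res.statusCode ?? 0);
    });
    req.on('error', () => resolve(0));
  });
}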

Number of maxSockets being cached

4 submitted, only 2 used. First submission I sent 2.

In the Electron app I have a field where a user can set the number of I/O threads they want the app to use. The idea is that the more threads used, the faster the scraper will go. If I set two threads, it should (see the problem above with the number of sockets being ignored) make two requests concurrently and then continue as those two complete.

If I run it a second time and set it to four, my logs show that four is being received, but the request agent's maxSockets still shows that 2 are being used. I believe this is because the script keeps living as long as the Electron app keeps living. I've tried a few things, like globalAgent.destroy(), without any luck. If I reload the app, it resets and correctly uses the (first) number passed in.
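One thing I want to try next is building a brand-new agent for every run instead of touching the shared globalAgent, so the previous run's maxSockets value can't stick around in the long-lived Electron process. A sketch of that idea (the function names are made up for illustration):

import * as https from 'https';

// Hedged sketch: a fresh agent per run means the Electron process can't hand the
// checker the agent (and maxSockets value) left over from the previous run.
export function makeRunAgent(ioThreads: number): https.Agent {
  return new https.Agent({ keepAlive: false, maxSockets: ioThreads });
}

// When the run finishes, close any sockets the agent still holds open.
export function teardownRunAgent(agent: https.Agent): void {
  agent.destroy();
}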

Difference in request time in Electron vs. running it manually

When I built this script, I set 10 seconds as my request timeout ceiling. That didn't seem to be a problem and made sense to me: if a request took longer than 10 seconds to complete, that was far longer than any reasonable website should take to respond.

Using this same timeout in the Electron app gave a bunch of 999 statuses. If I bumped the timeout to 100 seconds, the requests would resolve, but the run would take a LOT longer. So…the requests are literally taking longer. Why would that be any different between the Electron app and running the script manually?
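For reference, the timeout handling is conceptually something like this, with 999 standing in as a "timed out" pseudo-status (the helper name and numbers are illustrative, not the project's exact code):

import * as https from 'https';

// Hedged sketch: a request helper with a configurable timeout, where a timed-out
// request is reported as a 999 pseudo-status instead of a real HTTP code.
function checkWithTimeout(url: string, timeoutMs: number): Promise<number> {
  return new Promise((resolve) => {
    const req = https.get(url, { timeout: timeoutMs }, (res) => {
      res.resume();
      resolve(res.statusCode ?? 0);
    });
    req.on('timeout', () => {
      req.destroy();  // abort the socket
      resolve(999);   // pseudo-status for "took longer than the ceiling"
    });
    req.on('error', () => resolve(0));
  });
}

// 10 seconds works fine from the CLI; inside Electron the same URL may need far more.
checkWithTimeout('https://example.com', 10_000).then(console.log);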

Should I allow adjustment of maxSockets?

As I write this article, I'm trying to decide whether I should even have a field for maxSockets that the user can adjust. If I'm trying to limit it so that a user can't DDoS a site, I don't think this is the way to do it. A non-engineer using this probably wouldn't understand the danger; they would just want more speed and would try to set it as high as possible.

This brings up the question of whether I should try to limit it at all. Should I just let it run as fast as possible and not have the field available? It would certainly fix my two problems above. That could potentially be dangerous, but I believe there are other tools that do similar things.

I think in the end I will settle on removing the field and always setting maxSockets to four. Four shouldn't be enough to DDoS any normal site, and if someone really wanted to DDoS a site, there are better tools than this running at four maxSockets. This solves my caching problem above, but I'm still curious why it's happening, so I will probably keep digging into it.
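If I go that route, the change is tiny; something like this sketch (names are illustrative):

import * as https from 'https';

// Hedged sketch: no user-facing field; the checker always runs with four sockets.
const MAX_SOCKETS = 4;
const agent = new https.Agent({ keepAlive: true, maxSockets: MAX_SOCKETS });
// Every request the checker makes would pass this agent explicitly.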

How do I stop the checker?

I really am not sure. To be continued. I’ll do some more research over the next week and see what I can do to stop it. I’ll probably need to set up an API in the link checker and then maybe leverage web workers.
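One direction I'm considering (nothing implemented yet, so treat this as a sketch): have the Electron UI hand the checker an abort signal and check it between requests.

// Hedged sketch: a cancellation signal checked between link checks.
// Nothing like this exists in the project yet; the names are illustrative.
const controller = new AbortController();

async function runChecker(links: string[], signal: AbortSignal): Promise<void> {
  for (const link of links) {
    if (signal.aborted) {
      console.log('Checker stopped by the user.');
      return;
    }
    // await checkLink(link); // the real per-link request would go here
  }
}

// A "Stop" button in the Electron renderer could then trigger:
controller.abort();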

Sample code here
