Posts

Showing posts from August, 2024

Dataset Analysis and Sorting Speedup

In this article, I discuss the analysis and sorting of data while saving computational resources.

Problem

It is extremely rare to need to find the median or another percentile of a large set of unordered numbers, especially when you do not know the minimum, maximum, or the distribution of values in the set. However, when such a situation arises, it can cause a lot of headaches since obtaining an element by index requires sorting the entire set. The situation worsens when there is a lot of data and not enough free memory. The time spent on sorting can also play a significant role.

Simple Solution

What can be done to retrieve any element from a sorted set without sorting the entire set? You can analyze the data to determine approximately where the element with the desired index is located, thus simplifying the task of finding the needed element. Typically, such an analysis involves dividing the overall range of numbers into an appropriate number of ranges, reading the...
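The excerpt above is cut off, so the following is only a minimal sketch of the general idea it introduces, not the article's actual implementation. It assumes the data can be read more than once: one pass finds the minimum and maximum, a second counts how many values fall into each equal-width range, and a third collects and sorts only the range that contains the desired index. The function name `kth_smallest` and the parameter `num_buckets` are hypothetical names chosen for illustration.

```python
def kth_smallest(values, k, num_buckets=16):
    """Return the element that would be at index k (0-based) if values were sorted.

    Sketch only: assumes `values` is a list that can be scanned several times.
    """
    n = len(values)
    if not 0 <= k < n:
        raise IndexError("k is out of range")

    # Pass 1: find the overall range of the data.
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # every value is identical

    width = (hi - lo) / num_buckets

    def bucket_of(v):
        # Clamp so that the maximum value lands in the last bucket.
        return min(int((v - lo) / width), num_buckets - 1)

    # Pass 2: count how many values fall into each range.
    counts = [0] * num_buckets
    for v in values:
        counts[bucket_of(v)] += 1

    # Walk the counts to find the range that holds index k.
    seen = 0  # number of values in all earlier ranges
    for b, c in enumerate(counts):
        if seen + c > k:
            break
        seen += c

    # Pass 3: sort only the values from that single range.
    candidates = sorted(v for v in values if bucket_of(v) == b)
    return candidates[k - seen]
```

For example, `kth_smallest(data, len(data) // 2)` returns the same value as `sorted(data)[len(data) // 2]`, but only one range is ever sorted. With a heavily skewed distribution most values may land in a single range, in which case the same procedure could be applied again to that range alone.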

How to create a Simple Captcha Resolver

"All aircraft designers once learned to make paper airplanes."
Someone Smart

This article shows my journey of developing a simple captcha resolver. By repeating the steps described, you will be able to get acquainted with the technologies and create your own recognizer for experimentation and study. Some specific development points have been omitted to simplify the material in the article.

It should be noted that I am not an ML specialist and am only superficially familiar with machine learning technologies. I developed this captcha recognizer out of curiosity, driven by an interest in understanding how the learning process works. So I would be very grateful for advice and comments from those who understand this.

There are numerous parameters mentioned in the article that I don’t explain in detail. The goal of this approach is to make the article shorter and simpler while encouraging readers to experiment with the parameters themselves. I have used the simplest algorithms po...