Posts by xwe10

    Just use "ThreadPoolExecutor" with a Context Manager, see here: https://docs.python.org/3/library/concurrent.futures.html


    Quote:

    Code
    If max_workers is None or not given, it will default to the number of processors on the machine, multiplied by 5, assuming that ThreadPoolExecutor (https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor) is often used to overlap I/O instead of CPU work and the number of workers should be higher than the number of workers for ProcessPoolExecutor (https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor).
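    A minimal sketch of that pattern, using ThreadPoolExecutor as a context manager. The URLs and the fetch function are placeholders, not part of the monitor code:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Stand-in for the real HTTP request; here it just echoes the URL.
    return "fetched " + url

urls = ["https://example.com/a", "https://example.com/b"]

# The context manager calls shutdown(wait=True) on exit, so all
# submitted tasks finish before the block is left.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch, urls))

print(results)
```

    pool.map returns results in the order of the inputs, even though the calls run concurrently.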

    Is it a problem when I create threads inside a thread?

    For example, say I have a site and I want to scrape multiple queries at once; can I create a thread for each query?


    Is the ThreadPoolExecutor better than the ThreadPool?

    Just a thought: The main difference between your laptop and your VPS is that the available host CPU cores are yours w.r.t. the former, while they're shared among all users w.r.t. the latter (and are very likely overbooked as well). I would expect a somewhat better behaviour using an RS ("root server") where the host CPU cores are allocated differently.

    I think I found the problem: I am creating too many threads of one type of site monitor.


    What is the maximum number of threads that I should use for I/O-bound work?

    How do you print out your statistics / measure time?

    The default print call in Python is not a thread-safe operation.


    Do you use any synchronisation mechanisms?
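    One common synchronisation mechanism for this, sketched minimally: serialise output from all threads with a shared threading.Lock. (The demo collects into a list instead of actually printing, so the result is checkable; the names are placeholders.)

```python
import threading

output_lock = threading.Lock()
lines = []  # collected output, standing in for stdout

def log_line(msg):
    # Only one thread at a time may hold the lock, so entries from
    # different threads can never interleave mid-line.
    with output_lock:
        lines.append(msg)

threads = [threading.Thread(target=log_line, args=("worker %d" % i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(lines))
```

    The logging module does essentially this internally (each handler has its own lock), which is one reason to prefer it over print in threaded code.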

    I am using the normal print function from each thread; I didn't know that it's not thread-safe.


    I am also using the logging library to create log files where I also print the statistics. Do you think this could also be a problem?


    But why does it work on my laptop and not on the server? That wouldn't make any sense, no?



    Code
    import logging

    def create(name, level=logging.DEBUG):
        """Factory function to create a logger"""
        handler = logging.FileHandler("logs/" + name + ".log", mode="w")
        handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(message)s'))
        handler.setLevel(level)
        logger = logging.getLogger(name)
        logger.setLevel(level)
        logger.addHandler(handler)
        return logger
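    A side note on that factory: logging handlers are thread-safe, but logging.getLogger(name) returns the same logger object on every call, so calling create() twice with the same name stacks a second FileHandler onto it and every message gets written twice. A sketch of a guard against that (StreamHandler stands in for the FileHandler so no log directory is needed):

```python
import logging

def create(name, level=logging.DEBUG):
    """Factory that reuses an already-configured logger instead of stacking handlers."""
    logger = logging.getLogger(name)
    if logger.handlers:
        # Already configured by an earlier call: return it unchanged.
        return logger
    handler = logging.StreamHandler()  # stand-in for logging.FileHandler(...)
    handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(message)s'))
    handler.setLevel(level)
    logger.setLevel(level)
    logger.addHandler(handler)
    return logger

a = create("demo")
b = create("demo")
print(a is b, len(a.handlers))
```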

    Could it be that you just "fire" too quickly and the queue / backlog gets stuck at some point? Is it really necessary for the application to scrape that often? Maybe your IP runs into some external limiter of the website you are scraping from?


    If you want to use that as a "sneaker sniper", a rate of 1 scrape every second, or every 5 seconds, should also be more than enough.

    I am requesting every 15 seconds and I am also using proxies; I don't think the website is the problem.


    This problem occurs with all websites that I monitor.

    Thank you for your help.


    Laptop:

    Docker Version: Docker version 20.10.22, build 3a2c30b

    Python Version: Python 3.10 (Same version because of the Docker Image)


    Processor:

    Device name: Lenovo

    Processor: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz 1.90 GHz

    Installed RAM: 8.00 GB (7.88 GB usable)

    System type: 64-bit operating system, x64-based processor


    Server:

    Docker Version: Docker version 20.10.22, build 3a2c30b

    Python Version: Python 3.10 (Same version because of the Docker Image)


    Name: VPS 2000 G10 (KVM vServer with 8 cores and 12 GB of memory)


    I also tried it without Docker and still had the same problem, so I don't think Docker is the problem.


    Example Site:


    Prodirectsoccer Upcoming Dunk Releases


    I am just requesting the site and then waiting 15 seconds with time.sleep() before requesting it again to check for any upcoming releases.
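    That polling loop, as a minimal sketch. check_releases and the interval parameter are placeholders (the real interval is the 15 seconds described above; the demo uses 0 so it runs instantly):

```python
import time

def monitor(check_releases, interval=15, iterations=None):
    """Call check_releases every `interval` seconds; stop after `iterations` polls."""
    results = []
    count = 0
    while iterations is None or count < iterations:
        results.append(check_releases())  # the request itself (I/O-bound)
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval)          # wait before the next request
    return results

# Demo with a stub check and zero interval.
out = monitor(lambda: "checked", interval=0, iterations=3)
print(out)
```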


    This takes 0.8-1 second on my laptop.
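    Assuming the per-request timing is measured in code, time.perf_counter is the appropriate clock for this kind of measurement (monotonic, high resolution) — a sketch with a stub workload:

```python
import time

def timed(fn):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return result, elapsed

# Stub workload standing in for the real request.
result, elapsed = timed(lambda: sum(range(1000)))
print(result, elapsed >= 0.0)
```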


    I am not doing any calculations; everything is I/O-bound (just waiting for responses).



    Thank you


    No, I am located in Germany, and the server being slower than my laptop is exactly my problem. It doesn't make sense: it's the same code, yet on the server it runs much slower for no apparent reason and the response times are inconsistent. The server has more cores than my laptop, and the server's internet connection is also much faster.


    1: I can use threads, and they speed up my software because they are waiting most of the time (I/O-bound work).

    This can't be the problem, because the software works 100% on my laptop. I guess the server's vCores handle threads differently, because when I used processes I didn't have these problems on the server.

    This only partly makes sense.


    What programming language / runtime environment did you use?

    What functions did you use to run your software in different processes?

    What functions / libs did you use to shift this into threads?

    I use Python. I used the multiprocessing library for the processes in the past and switched to threads by using the threading library.
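    For context, such a switch can be almost a drop-in change, since threading.Thread and multiprocessing.Process share the same start/join interface — a sketch with a stub worker, not the actual monitor code:

```python
import threading

def worker(name, results):
    # Threads share memory with the main thread, so a plain list
    # works as a result store (appends are atomic under the GIL).
    results.append("monitor %d done" % name)

results = []
threads = [threading.Thread(target=worker, args=(i, results)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With multiprocessing.Process the loop would look identical, but a
# plain list would NOT work: processes don't share memory, so results
# would need a multiprocessing.Queue or Manager().list() instead.
print(sorted(results))
```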


    The software is written by myself and monitors multiple websites for restocks of rare products.


    So most of the time the threads are just waiting for the website to respond (I/O-bound threads).

    Hello Netcup Community,


    Since a recent update of my software, where I switched from multiprocessing to multithreading, I have noticed a huge decrease in its performance.

    On my laptop, which is much weaker than the server performance-wise, the software takes only 1 second per request, while on the server it consistently varies between 6 and 12 seconds.


    The software uses multiple I/O-bound threads and runs in Docker.


    The server is a KVM vServer with 8 cores and 12 GB of memory.


    Could the V-cores of the KVM server be causing the problems?


    Here is an example thread that is doing I/O-bound work (waiting for the website to respond):


    2023-03-22 12_26_22-prodirectsoccer_release.log - monitor-service - Visual Studio Code.png

    My Laptop


    2023-03-22 12_27_01-prodirectsoccer_release.log - monitor-service - Visual Studio Code.png

    The Server