Commit

updated to v1.0
imneonizer authored Oct 12, 2019
1 parent fd850fa commit e6209b5
Showing 8 changed files with 122 additions and 37 deletions.
86 changes: 57 additions & 29 deletions README.md
pip install improcess

This tiny little Python module is useful for creating multiple processes of any function in seconds.

### Warning

If your workload needs more than 4 tasks running in parallel, kindly don't use improcess: [imthread](https://github.com/imneonizer/imthread) is fast and reliable for huge parallelization using only a single core. However, while benchmarking I found that neither imthread nor improcess alone was sufficient for running 1 million small tasks in parallel, because of the limited number of cores. So I combined the two libraries, creating 5 processes with 200,000 threads each, and that took half the time of creating 1 million threads in a single process. Since my computer has only 4 cores, creating more than 4 processes at once doesn't seem efficient anyway; those 1 million small tasks completed in about 3 minutes. I'll update the link for the code.
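
Until that link is up, here is a rough sketch of the idea using only the standard library (`multiprocessing.Pool` for the processes, `concurrent.futures.ThreadPoolExecutor` for the threads); the task, job count, and pool sizes below are illustrative placeholders, not the actual benchmark code.

````python
# A rough sketch of the "few processes, many threads" idea described above,
# using only the standard library. The task, the 10_000 jobs, and the pool
# sizes are placeholders for illustration.
import time
from multiprocessing import Pool
from concurrent.futures import ThreadPoolExecutor

def task(i):
    time.sleep(0.01)          # stand-in for one small I/O-bound job
    return i

def run_batch(batch):
    # inside each process, a thread pool works through that process's share
    with ThreadPoolExecutor(max_workers=100) as threads:
        return list(threads.map(task, batch))

if __name__ == '__main__':
    jobs = list(range(10_000))
    chunks = [jobs[i::4] for i in range(4)]   # one chunk per process
    with Pool(processes=4) as processes:
        results = processes.map(run_batch, chunks)
    print(sum(len(r) for r in results), 'jobs finished')
````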

#### Latest v1.0

A quick launch mode has been added: just call `improcess.start(func_name, repeat=10)` and it will execute the given function the given number of times in parallel. A standard way of measuring elapsed time has been added as well; see the examples below to understand how to use quick launch mode.

Other than that, to keep track of how many processes are being created in real time, you can drop a log call into your processing function so that you can see whenever a new process is created. There are two ways of tracking them.
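
For instance, using the `console_log` helper that ships with this release, a processing function can either print the index automatically or capture the returned index and log it in its own way (a minimal sketch; `my_func` and the custom message are placeholders):

````python
import improcess

def my_func(data):
    # method 1: let improcess print ">>> Creating Process N" as each process starts
    improcess.console_log(output=True)
    return data * 2

def my_func_quiet(data):
    # method 2: capture the process index and log it however you like
    p_index = improcess.console_log()
    print(f'process {p_index} is handling {data}')
    return data * 2
````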

So what this module does is, at the time of object initialization it takes in the…
````python
import improcess
import time

def my_func(data):
    process = improcess.console_log(output=True)
    time.sleep(5)
    return data*100

#list of input data for processing
raw_data = [1,2,3,4,5,6,7,8,9,10]

if __name__ == '__main__':
    result = improcess.start(my_func, repeat=10, max_process=10)
    print(f'Result: {result}')
    print(f'>>> Elapsed time: {improcess.elapsed()} sec')
````

#### output
````
Elapsed time: 5.53 sec
````

Now you can clearly see that without multiprocessing it would have taken around ``50 Seconds`` to process the data, running the task one by one and waiting ``5 Sec`` after each call. But since we are doing it with multiprocessing, it takes only about ``5 Seconds`` to process the same task with different data, each in its own individual process.
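
For comparison, the same workload done sequentially (plain Python, no improcess) really does add up to about ``50 Seconds``:

````python
# Sequential baseline for the comparison above: ten 5-second tasks, one by one.
import time

def my_func(data):
    time.sleep(5)
    return data*100

if __name__ == '__main__':
    st = time.time()
    result = [my_func(i) for i in range(1, 11)]
    print(f'Result: {result}')
    print(f'Elapsed time: {round(time.time() - st, 2)} sec')   # roughly 50 sec
````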

**One thing to take care of:** always execute `improcess.start()` inside the `__main__` guard:

````python
if __name__ == "__main__":
    improcess.start()
````

This is essential: without the guard, every new process would re-import the script and improcess would end up creating duplicate processes of itself.

#### Example 2

````python
import improcess
import requests

#the function for processing data
def my_func(data):
    #makes a request to the webserver (the request line is collapsed in this diff view)
    return data

if __name__ == '__main__':
    processed_data = improcess.start(my_func, repeat=20, max_process=20)

    #printing the synchronised received results
    print()
    print(f'>> Result: {processed_data}')

    improcess.elapsed(output=True)
````

#### output
````
>>> Creating Process 20

>> Result: [<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
>>> Elapsed time: 5.92 sec
````

In this example we didn't use `time.sleep()`; instead we make a request to a webserver, which takes about ``0.5 seconds`` to return. We did it 20 times with multiprocessing and were able to receive the results, in a synchronous order, in far less time.
It is clear that every request to the server was taking approx. ``0.5 seconds``.

Though the elapsed time is a little slower than with the [imthread](https://github.com/imneonizer/imthread) library, because with multiprocessing each process has its own individual console and memory space, both of these libraries can be used in conjunction to achieve ultra fast processing: for instance, we can create 4 individual processes and within every process create 1000 threads, which will be a lot faster than using only [imthread](https://github.com/imneonizer/imthread) or only [improcess](https://github.com/imneonizer/improcess).

#### Example 3

Quick launch mode is a new feature: you can directly pass improcess the repetitive function, the input data for it, and how many processes you want it to create at a time. And if you just want it to repeat the function without any inputs, you can do that too.

````python
import improcess
import random
import time

names = ['April', 'May']

#the function for processing data
def my_func(data):
    improcess.console_log(output=True)
    name = random.choice(names)
    time.sleep(1)
    return f'{name} says, Hello World!'

if __name__=="__main__":
    processed_data = improcess.start(my_func, repeat=4)
    print(processed_data)
    improcess.elapsed(output=True)
````

We kept a time gap of 1 sec inside the function, yet it still completed the task 4 times in about that same time. And since the function can access global variables, we can assign it tasks that don't need different inputs every time.

#### Handling errors and killing all processes

By default, if any error occurs, the processes will keep on running; this is handy when you want to ignore some errors. But if you want to kill all the processes at once, you can call ``improcess.stop()`` while handling the error.
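
A minimal sketch of that pattern (the failing input and the error type here are just illustrative):

````python
import improcess

def my_func(data):
    try:
        return 100 / data          # fails when data is 0
    except ZeroDivisionError:
        improcess.stop()           # raise the stop signal so all processes are killed

if __name__ == '__main__':
    result = improcess.start(my_func, data=[1, 2, 0, 4])
    print(result)
````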
Binary file added dist/improcess-1.0.tar.gz
Binary file not shown.
28 changes: 27 additions & 1 deletion improcess/__init__.py
@@ -1,6 +1,32 @@
from improcess.improcess import multi_processing, console_log
import time

st = 0

#quick launch: run processing_func over a list of inputs (data=) or repeat it N times (repeat=)
def start(processing_func, data=None, repeat=None, max_process=4):
    assert type(max_process) == int, 'max_process value should be an integer'
    assert max_process > 0, 'max_process value cannot be less than 1'
    mp_local = multi_processing(processing_func, max_process=max_process)
    global st
    st = time.time()

    if data:
        processed_data = mp_local.start(data)
        return processed_data

    elif repeat:
        assert type(repeat) == int, 'repeat value should be an integer'
        assert repeat > 0, 'repeat value cannot be less than 1'
        processed_data = mp_local.start(repeat)
        return processed_data

    else:
        print(f'data: {data}, repeat: {repeat}')

#call from inside a processing function to raise the stop signal and kill all processes
def stop():
    raise Exception('stop_process')


#seconds elapsed since the last start() call
def elapsed(output=False):
    tt = round((time.time() - st), 2)
    if output:
        print(f'>>> Elapsed time: {tt} sec')
    return tt
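
For reference, both calling conventions of the `start()` helper above look like this (a small usage sketch; `my_square` is just an illustrative function):

````python
import improcess

def my_square(x):
    return x * x

if __name__ == '__main__':
    # explicit inputs: one process per item, at most max_process (default 4) at a time
    print(improcess.start(my_square, data=[1, 2, 3, 4, 5]))

    # no inputs: with repeat=, the function receives the indices 1..5 as data
    print(improcess.start(my_square, repeat=5))

    print(f'last run took {improcess.elapsed()} sec')
````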
Binary file modified improcess/__pycache__/__init__.cpython-37.pyc
Binary file not shown.
Binary file modified improcess/__pycache__/improcess.cpython-37.pyc
Binary file not shown.
23 changes: 18 additions & 5 deletions improcess/improcess.py
@@ -25,9 +25,16 @@ def process_data(self, data):
            print(e)
        finally:
            return (success, data[0], processed_data)

    def start(self, raw_data):

        if type(raw_data) == int:
            pseudo_infinity = raw_data
            process_count = raw_data
        else:
            pseudo_infinity = len(raw_data)
            process_count = len(raw_data)

        if process_count < self.max_process:
            self.max_process = process_count

@@ -38,8 +45,14 @@ def start(self, raw_data):
        final_result = []
        #marking each process with a index id
        try:
            for i in range(1, pseudo_infinity+1):
                try:
                    index_data = raw_data[i-1]
                except Exception:
                    index_data = i

                args.append((i, index_data, self.process))

                if i%self.max_process == 0:
                    #starting processes
                    result = pool.map(self.process_data, args) #internal_processing_func, arguments_list
@@ -69,4 +82,4 @@ def console_log(output=False):
    data = p_index
    if output:
        print(f'>>> Creating Process {data}')
    return data
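
The `start()` method above queues up `(index, data, func)` tuples and flushes them to `pool.map` each time a full batch of `max_process` items has accumulated. A rough standalone illustration of that batching pattern with `multiprocessing.Pool` (not the library's actual code; the worker and the numbers are placeholders):

````python
from multiprocessing import Pool

def worker(args):
    index, data = args
    return (index, data * 100)

if __name__ == '__main__':
    raw_data = list(range(1, 11))
    max_process = 4
    args, results = [], []
    with Pool(processes=max_process) as pool:
        for i, item in enumerate(raw_data, start=1):
            args.append((i, item))
            if i % max_process == 0:           # a full batch is ready
                results.extend(pool.map(worker, args))
                args = []
        if args:                               # flush the final partial batch
            results.extend(pool.map(worker, args))
    print(results)
````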
4 changes: 2 additions & 2 deletions setup.py
@@ -2,13 +2,13 @@
setup(
    name = 'improcess', # How you named your package folder (MyLib)
    packages = ['improcess'], # Chose the same as "name"
    version = '1.0', # Start with a small number and increase it with every change you make
    license='MIT', # Chose a license from here: https://help.github.com/articles/licensing-a-repository
    description = 'This short little python module can help you with running your iteratable functions on multi process without any hassle of creating process by yourself.', # Give a short description about your library
    author = 'Nitin Rai', # Type in your name
    author_email = '[email protected]', # Type in your E-Mail
    url = 'https://github.com/imneonizer/improcess', # Provide either the link to your github or to your website
    download_url = 'https://github.com/imneonizer/improcess/archive/v1.0.tar.gz', # I explain this later on
    keywords = ['Multi Processing', 'Synchronous Processing', 'Parallel execution'], # Keywords that define your package best
    classifiers=[
        'Development Status :: 4 - Beta', # Chose either "3 - Alpha", "4 - Beta" or "5 - Production/Stable" as the current state of your package
18 changes: 18 additions & 0 deletions test.py
@@ -0,0 +1,18 @@
import improcess
import random
import time

names = ['April', 'May']

#the function for processing data
def my_func(data):
    improcess.console_log(output=True)
    name = random.choice(names)
    time.sleep(1)
    return f'{name} says, Hello World!'

if __name__=="__main__":
    processed_data = improcess.start(my_func, repeat=4)

    print(processed_data)
    improcess.elapsed(output=True)
