small_world calculations

This is a follow-up issue after discussing #131 

Right now the plotting function recalculates the small coefficient of "real_graph" relative to the random graphs every time it is called. That is why it takes a while, which is not OK for a plotting function.

Our discussion led to two ways to solve it:

1) Calculate `small_world` once and save the values as a **property of GraphBundle**, which means adding a field to the GraphBundle class. When plotting small coefficient values, we could check whether these values already exist and skip recalculating `small_world`.

2) Reduce the time needed to calculate the *small_world* values.
@KirstieJane asked me to show which step takes the most time. Here it is:

<details><summary>Code under the hood</summary>
<p>

```
%%time
# Calculate the small coefficient of `gname` relative to each other graph in GraphBundle
bundleGraphs.report_small_world("real_graph")
```

What *report_small_world* does is the following:
```
%%time
import networkx as nx

# Compute small world sigma from (clustering, path length) tuples
def small_world_sigma(tupleG, tupleR):
    Cg, Lg = tupleG
    Cr, Lr = tupleR
    return (Cg / Cr) / (Lg / Lr)

# Calculate the small coefficient of G relative to R
def small_coefficient(G, R):
    return small_world_sigma((nx.average_clustering(G),
                              nx.average_shortest_path_length(G)),
                             (nx.average_clustering(R),
                              nx.average_shortest_path_length(R)))

# Small coefficient of "real_graph" relative to every graph in the bundle
global_dict = {}
for name, graph in bundleGraphs.items():
    global_dict[name] = small_coefficient(bundleGraphs["real_graph"], graph)
```
</p>
</details>
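
For reference, the value returned by `small_world_sigma` in the code above can be written as below, where C is the average clustering and L is the average shortest path length. The subscripts g and r follow the `G` and `R` arguments in the code; this is just a restatement of that function, not a new definition.

```math
\sigma = \frac{C_g / C_r}{L_g / L_r}
```

When G and R are the same graph, both ratios are 1 and so σ = 1.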

Time to execute code:

![image](https://user-images.githubusercontent.com/14000852/61410863-d2458e00-a8ed-11e9-9382-621db2045ab3.png)

![image](https://user-images.githubusercontent.com/14000852/61410904-ebe6d580-a8ed-11e9-91ee-d835d0d2cea6.png)

Looking at what `report_small_world` essentially does, it is easy to see that for every pair ("real_graph", "random[i]_graph") we calculate **average_clustering** and **average_shortest_path_length** for both graphs again and again.

![image](https://user-images.githubusercontent.com/14000852/61410999-0f118500-a8ee-11e9-87c3-927bd8ef0f92.png)

That's wasteful, because we already have these measure values stored as properties of each Graph. There is no need to calculate them over and over.
```
bundleGraphs["<graph-name>"].graph["global_measures"]
```
  
So changing `small_coefficient()` to read the already available values instead of calculating them again makes the `small_world` calculations really fast!

```
def small_coefficient(G, R):
    # Read the stored global measures instead of recomputing them
    return small_world_sigma((G.graph["global_measures"]["average_clustering"],
                              G.graph["global_measures"]["average_shortest_path_length"]),  # noqa
                             (R.graph["global_measures"]["average_clustering"],
                              R.graph["global_measures"]["average_shortest_path_length"]))  # noqa
```
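
To make both ideas concrete, here is a minimal sketch that combines the stored-measures lookup with option 1 (remembering the results on the bundle). It is only an illustration under assumptions: `CachedBundle`, `make_graph`, the `small_world` attribute and the numbers are all hypothetical, and the stand-in objects mimic only the `.graph` dict of a real graph, so the actual GraphBundle may need a different shape.

```python
from types import SimpleNamespace


def small_world_sigma(tupleG, tupleR):
    Cg, Lg = tupleG
    Cr, Lr = tupleR
    return (Cg / Cr) / (Lg / Lr)


def small_coefficient(G, R):
    # Look up the stored measures; nothing is recomputed here
    g, r = G.graph["global_measures"], R.graph["global_measures"]
    return small_world_sigma(
        (g["average_clustering"], g["average_shortest_path_length"]),
        (r["average_clustering"], r["average_shortest_path_length"]))


def make_graph(clustering, path_length):
    # Hypothetical stand-in for a graph: only the `.graph` dict is mimicked
    return SimpleNamespace(graph={"global_measures": {
        "average_clustering": clustering,
        "average_shortest_path_length": path_length}})


class CachedBundle(dict):
    """Hypothetical bundle that remembers small-world values per reference graph."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.small_world = {}  # assumed new field: {gname: {name: sigma}}

    def report_small_world(self, gname):
        # Compute only on the first call; later calls reuse the stored dict
        if gname not in self.small_world:
            self.small_world[gname] = {
                name: small_coefficient(self[gname], graph)
                for name, graph in self.items()}
        return self.small_world[gname]


bundle = CachedBundle(real_graph=make_graph(0.5, 4.0),
                      random0_graph=make_graph(0.125, 2.0))
first = bundle.report_small_world("real_graph")
second = bundle.report_small_world("real_graph")  # served from the cache
```

One thing this sketch does not handle is invalidation: if graphs are added to the bundle after the first call, the stored values would be stale and would need to be cleared.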

Thanks for reading till the end :)

P.S. The issue is so long because my initial goal was to document which part of the code takes the most time.

small_world calculations #141
