Python and Django II - Prefetch

Aug 15 2019


A handy QuerySet method is prefetch_related(). It returns a new QuerySet that, along with the main objects, retrieves the specified related objects in a single batch.


This is especially useful in for loops. Loops are potentially dangerous because, if the related data isn't cached, every iteration that accesses a Foreign Key attribute makes another database query to retrieve the value. With large querysets the hits to the database can get out of hand fast and slow the whole program down (e.g. as an app scales, it gets slower and slower because it issues more and more database queries).
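
To make the cost concrete, here is a minimal plain-Python sketch of the pattern described above. It does not use Django: FakeDatabase, fetch_books() and fetch_author() are hypothetical stand-ins that only count how many "queries" a loop would trigger.

```python
# Plain-Python simulation of "one query per loop iteration".
# FakeDatabase and its methods are hypothetical, not Django APIs.

class FakeDatabase:
    def __init__(self):
        self.query_count = 0
        self.authors = {1: "Ann", 2: "Bob", 3: "Cy"}
        # Each book row stores the id of its author (the "foreign key").
        self.books = [
            {"title": "A", "author_id": 1},
            {"title": "B", "author_id": 2},
            {"title": "C", "author_id": 3},
            {"title": "D", "author_id": 1},
        ]

    def fetch_books(self):
        self.query_count += 1
        return [dict(row) for row in self.books]

    def fetch_author(self, author_id):
        self.query_count += 1
        return self.authors[author_id]


def author_names_naive(db):
    names = []
    for book in db.fetch_books():  # 1 query for the books
        # Following the "foreign key" costs 1 more query per book.
        names.append(db.fetch_author(book["author_id"]))
    return names


db = FakeDatabase()
names = author_names_naive(db)
print(names)           # ['Ann', 'Bob', 'Cy', 'Ann']
print(db.query_count)  # 5: one for the books, four for the authors
```

The query count grows with the number of rows, which is why the slowdown only shows up once the tables get large.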


We can avoid the extra database hits by using the prefetch_related() method to load the Foreign Key values up front and cache the results on each object (prefetch_related() works for M2M and reverse Foreign Key relations as well as Foreign Key and One-to-One fields).


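This can cut the number of database queries down to TWO (one for the main objects and one for the related objects) instead of hundreds. Each additional relationship that is prefetched adds roughly one more query. A bit of care must still be taken when dealing with M2M relationships, because in certain cases a call to the database may still be made, for example when the prefetched relation is filtered again inside the loop. It is best to check for sure using a debug statement, such as printing len(django.db.connection.queries) while DEBUG is on.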
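
The drop to two queries can be illustrated with a small plain-Python sketch. It only imitates the general idea of fetching the related rows in one batch and attaching them in Python; FakeDatabase, fetch_books() and fetch_authors_in() are hypothetical names, not Django APIs.

```python
# Plain-Python simulation of batched loading; all names are hypothetical.

class FakeDatabase:
    def __init__(self):
        self.query_count = 0
        self.authors = {1: "Ann", 2: "Bob", 3: "Cy"}
        self.books = [
            {"title": "A", "author_id": 1},
            {"title": "B", "author_id": 2},
            {"title": "C", "author_id": 3},
            {"title": "D", "author_id": 1},
        ]

    def fetch_books(self):
        self.query_count += 1
        return [dict(row) for row in self.books]

    def fetch_authors_in(self, author_ids):
        # One query for many ids, comparable to "WHERE id IN (...)".
        self.query_count += 1
        return {author_id: self.authors[author_id] for author_id in author_ids}


def books_with_prefetched_authors(db):
    books = db.fetch_books()                     # query 1
    author_ids = {book["author_id"] for book in books}
    authors = db.fetch_authors_in(author_ids)    # query 2
    for book in books:
        # The "join" happens in Python; no further queries are needed.
        book["author"] = authors[book["author_id"]]
    return books


db = FakeDatabase()
books = books_with_prefetched_authors(db)
names = [book["author"] for book in books]  # served from memory
print(names)           # ['Ann', 'Bob', 'Cy', 'Ann']
print(db.query_count)  # 2, however many books there are
```

Unlike a per-row lookup, the count here stays at two whether there are four books or four thousand.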


One way to avoid these situations is to combine prefetching with caching via getattr() and setattr() (a Prefetch object can also be used). Basically, we use getattr() to look up a cached attribute, with a default of None, and if nothing is cached yet we build whatever we need and store it with setattr(). For example:


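def get_all_actual_objects(self):
    cache_name = "_all_actual_objects"
    # Reuse the queryset stored on this instance, if there is one.
    result = getattr(self, cache_name, None)
    if result is None:
        result = self.experimental_object_queryset.filter(
            actual__isnull=False,
            foreign_key__object__bool=False,
        ).prefetch_related(
            # Load the related objects in batches instead of per row.
            "actual",
            "foreign_key__object__associated_users__user",
        )
        # Remember the queryset so later calls skip the work above.
        setattr(self, cache_name, result)
    return result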
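
The getattr()/setattr() caching step works the same way outside Django, so it can be tried in isolation. The sketch below is a plain-Python illustration: Report, load_rows() and get_rows() are hypothetical names, and load_rows() stands in for building an expensive queryset.

```python
class Report:
    """Hypothetical class; load_rows() stands in for expensive work."""

    def __init__(self):
        self.load_calls = 0

    def load_rows(self):
        self.load_calls += 1
        return [1, 2, 3]

    def get_rows(self):
        cache_name = "_rows"
        result = getattr(self, cache_name, None)  # None if not cached yet
        if result is None:
            result = self.load_rows()
            setattr(self, cache_name, result)     # remember for next call
        return result


report = Report()
first = report.get_rows()
second = report.get_rows()
print(first is second)    # True: the second call returns the cached list
print(report.load_calls)  # 1: the expensive step ran only once
```

Two things to keep in mind with this pattern: the cache lives on the instance, so it lasts only as long as that object does, and a legitimate result of None would be recomputed on every call because None doubles as the "not cached" marker.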

So above we build the queryset we want and store it as an attribute. With the help of prefetch_related() we also grab, in batches rather than one row at a time, the extra objects that we will use in later for loops. E.g. later we might write:


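def count_all_users(self):  # hypothetical method name, same class as above
    all_actual_objects = self.get_all_actual_objects()
    total_users = []
    for obj in all_actual_objects:  # "obj" avoids shadowing the builtin "object"
        for profile in obj.user_profiles:
            total_users.append(profile)
    return len(total_users)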

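If we hadn't prefetched, the for loop above would need to query the database on every iteration to access each object's user_profiles, slowing down the app. But thanks to prefetch_related() and caching, we limit things to a small, fixed number of queries (TWO in the simplest case, plus roughly one for each further relationship prefetched) no matter how many objects we loop over, a great improvement.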