How does Django work internally?

Posted by Hitul Mistry

/

17 Jul 21

Tagged under: #django

Hitul Mistry gave a talk at EuroPython 2022 in Dublin, Ireland, presenting a walk-through of Django internals.


Summary of talk

How does Django start?

Django can be started with a simple command

python manage.py runserver

Django performs the below steps to start the server.

  • Find management commands

    • manage.py calls django.core.management.execute_from_command_line(), which creates a ManagementUtility instance and calls its execute() method.
    • ManagementUtility.execute() is the front door for running any management command.
    • Load models in all the apps.
    • Start HTTP Server.
  • Parse arguments

    • Django calls CommandParser.parse_args() to parse the command-line arguments.
    • CommandParser inherits from Python's built-in argparse.ArgumentParser and overrides the parse_args() method. The overridden method makes Django's error messages more relevant.
    • Django builds the list of available management commands and compares the requested command against it.
    • If we mistype a command, for example python manage.py runservr, Django suggests the closest match, runserver.
  • Load settings

    • Django finds the settings module from the DJANGO_SETTINGS_MODULE environment variable, which manage.py sets.
    • If Django cannot import the module from that path, it raises an exception.
    • Django does not load the settings until we use them. Until we access an attribute on settings, nothing is imported.
    • On the first attribute access, Django dynamically imports the settings module, loops over the upper-case attributes in that module and stores them on a Settings object.
    • LazySettings has a _setup() method, which loads the settings and keeps the resulting object in an instance attribute.
    • Django implements the lazy behaviour with the Python special methods __getattr__, __repr__, __setattr__ and __delattr__.
    • Django has a configure() method, which can be called to configure the settings dynamically. If it is called before the settings are loaded, Django does not import the module named in DJANGO_SETTINGS_MODULE.
    • configure() cannot be called once Django has loaded the settings module.
  • Load App Configuration and Logging

    • Django calls django.setup(), which loads all the Django apps, imports their modules and marks them ready.
    • Django loads the logging configuration from settings.py. Django uses Python's logging module.
    • Django loads the app configuration for each app. It lives in apps.py, both in Django's own built-in apps and in our apps.
    • Django looks in apps.py for classes that inherit from django.apps.AppConfig. If it finds more than one, it picks the one marked default = True. If more than one class is marked as default, it raises an exception.
    • Django loads the models in all the apps.
    • If Django finds a problem while loading models and apps, it stops the startup by raising an exception.
  • Start HTTP Server

    • runserver.py is a management command in Django's core application.
    • The run() function in the basehttp module (django.core.servers.basehttp) checks the threading option and starts the server.
    • socketserver.ThreadingMixIn is used for threading and wsgiref.simple_server.WSGIServer for the HTTP server.
    • The server is multithreaded if threading is true.
  • Auto reloader

    • Django's runserver restarts itself automatically when any file in the code changes.
    • Django has two ways to implement this: StatReloader, which is the default whenever we start the server, and Watchman.
    • StatReloader
      • Django prepares the list of files that are in scope for the running Django code, along with their modification times.
      • Django runs a loop that checks the file modification times every second. If any file's modification time has changed, Django reloads the code.
    • Watchman
      • Watchman is a utility that is more efficient than StatReloader. Watchman leverages OS features such as,
        • inotify - Linux
        • FSEvents / kqueue - Mac
        • Windows - Beta
      • When a file changes, Django is notified and restarts on that notification.
      • Installing pywatchman enables Watchman support in Django.
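
The lazy settings behaviour described in the "Load settings" step can be sketched with the standard library alone. This is a simplified illustration and not Django's actual code: the LazySettings class below is a stand-in written for this post, and the demo_settings module is a hypothetical settings module created in memory so the example is self-contained.

```python
import importlib
import sys
import types


class LazySettings:
    """Simplified stand-in for a lazy settings object (not Django's class)."""

    def __init__(self, module_name):
        self._module_name = module_name
        self._wrapped = None  # nothing is imported yet

    def _setup(self):
        # Import the settings module and keep only its upper-case names.
        module = importlib.import_module(self._module_name)
        self._wrapped = {
            name: getattr(module, name)
            for name in dir(module)
            if name.isupper()
        }

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for setting names.
        if self._wrapped is None:
            self._setup()
        try:
            return self._wrapped[name]
        except KeyError:
            raise AttributeError(name) from None

    def configure(self, **options):
        # Manual configuration is only allowed before the first load.
        if self._wrapped is not None:
            raise RuntimeError("Settings already configured.")
        self._wrapped = dict(options)


# Hypothetical settings module, registered in memory for the demo.
fake_module = types.ModuleType("demo_settings")
fake_module.DEBUG = True
fake_module.helper = "lower-case names are ignored"
sys.modules["demo_settings"] = fake_module

settings = LazySettings("demo_settings")
loaded_before = settings._wrapped is not None  # no access so far
debug_value = settings.DEBUG                   # first access triggers the import
loaded_after = settings._wrapped is not None
```

The import happens inside __getattr__, so code that merely imports the settings object pays nothing until it reads a setting.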

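The StatReloader idea from the "Auto reloader" step, polling file modification times and reacting to a change, can be sketched as follows. This is a minimal illustration under assumptions, not Django's implementation: the function names snapshot, changed_files and watch are made up for this post, and a real reloader restarts the process rather than calling a callback.

```python
import os
import tempfile
import time


def snapshot(paths):
    """Map each existing path to its last-modified time."""
    return {p: os.stat(p).st_mtime for p in paths if os.path.exists(p)}


def changed_files(previous, paths):
    """Return the paths whose mtime differs from the previous snapshot."""
    current = snapshot(paths)
    return [p for p, mtime in current.items() if previous.get(p) != mtime]


def watch(paths, on_change, interval=1.0, max_ticks=None):
    """Poll every `interval` seconds and call on_change with changed paths."""
    previous = snapshot(paths)
    ticks = 0
    while max_ticks is None or ticks < max_ticks:
        time.sleep(interval)
        changed = changed_files(previous, paths)
        if changed:
            on_change(changed)
            previous = snapshot(paths)
        ticks += 1


# Demo: create a file, move its mtime forward, and detect the change.
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as handle:
    demo_path = handle.name
before = snapshot([demo_path])
new_time = before[demo_path] + 10
os.utime(demo_path, (new_time, new_time))
demo_changed = changed_files(before, [demo_path])
os.remove(demo_path)
```

Polling is simple and portable, which is why it works as a default. The cost is one stat call per watched file per tick, which is the overhead Watchman avoids.
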
How does a request work?

Let's look at a simple HTTP request:

curl --location --request POST 'http://localhost:8000/test/' \
--header 'Content-Type: application/json' \
--data-raw '{
    "key": "value"
}'

The request the web server receives

On the network, the above request travels as raw text, shown below. We used Wireshark to capture it.

Frame 1: 372 bytes on wire (2976 bits), 372 bytes captured (2976 bits) on interface lo, id 0
Ethernet II, Src: 00:00:00_00:00:00 (00:00:00:00:00:00), Dst: 00:00:00_00:00:00 (00:00:00:00:00:00)
Internet Protocol Version 4, Src: 127.0.0.1, Dst: 127.0.0.1
Transmission Control Protocol, Src Port: 40806, Dst Port: 8000, Seq: 1, Ack: 1, Len: 306
Hypertext Transfer Protocol
    POST /test/ HTTP/1.1\r\n
        [Expert Info (Chat/Sequence): POST /test/ HTTP/1.1\r\n]
            [POST /test/ HTTP/1.1\r\n]
            [Severity level: Chat]
            [Group: Sequence]
        Request Method: POST
        Request URI: /test/
        Request Version: HTTP/1.1
    Content-Type: application/json\r\n
    User-Agent: PostmanRuntime/7.26.5\r\n
    Accept: */*\r\n
    Cache-Control: no-cache\r\n
    Postman-Token: f821ed79-842b-4681-816a-a06f593a4c98\r\n
    Host: localhost:8000\r\n
    Accept-Encoding: gzip, deflate, br\r\n
    Connection: keep-alive\r\n
    Content-Length: 22\r\n
        [Content length: 22]
    \r\n
    [Full request URI: http://localhost:8000/test/]
    [HTTP request 1/1]
    [Response in frame: 50]
    File Data: 22 bytes
JavaScript Object Notation: application/json
{"key": "value"}

Let's see how Django serves a simple HTTP request.

An HTTP client sends the request over the HTTP protocol. The request is received by a web server such as Gunicorn, uWSGI, Nginx or runserver (Django). The web server and Django communicate using the WSGI protocol.

Inside the project's root package there is a wsgi.py module. It calls get_wsgi_application(), which internally returns a WSGIHandler() instance.

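from django.core.handlers import base


class WSGIHandler(base.BaseHandler):
    def __init__(self, *args, **kwargs):
        # Initialization: runs once, when the application object is created.
        ...

    def __call__(self, environ, start_response):
        # Gets called whenever a request comes in.
        ...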

WSGIHandler has two methods, __init__ and __call__. __init__ gets called on HTTP server startup and __call__ gets called whenever an HTTP request comes in.
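
To make the __init__ / __call__ split concrete, here is a small WSGI application object that can be exercised without a server. It is a sketch, not Django's handler: TinyHandler and its request_count attribute are hypothetical, and wsgiref.util.setup_testing_defaults is used only to fill in the environ keys a real server would supply.

```python
from wsgiref.util import setup_testing_defaults


class TinyHandler:
    """A WSGI application object: built once, called once per request."""

    def __init__(self):
        # Runs once, when the server loads the application.
        self.request_count = 0

    def __call__(self, environ, start_response):
        # Runs for every request; environ is the parsed request dictionary.
        self.request_count += 1
        body = f"{environ['REQUEST_METHOD']} {environ['PATH_INFO']}".encode()
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]


application = TinyHandler()

# Build a request dictionary by hand instead of going through a socket.
environ = {"REQUEST_METHOD": "POST", "PATH_INFO": "/test/"}
setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers


response_body = b"".join(application(environ, start_response))
```

Any server that speaks WSGI can host an object with this shape, which is why the same Django project runs under runserver, Gunicorn or uWSGI.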

wsgiref parses the raw request shown above and converts it into a dictionary.

The dictionary contains all the request parameters.

{
    'HTTP_ACCEPT': '*/*',
    'HTTP_ACCEPT_ENCODING': 'gzip, deflate, br',
    'HTTP_CACHE_CONTROL': 'no-cache',
    'HTTP_CONNECTION': 'keep-alive',
    'HTTP_HOST': 'localhost:8000',
    ...
    'wsgi.errors': <_io.TextIOWrapper name='' mode='w' encoding='UTF-8'>,
    'wsgi.file_wrapper': <class 'wsgiref.util.FileWrapper'>,
    'wsgi.input': <django.core.handlers.wsgi.LimitedStream object at 0x7fe64482e898>,
    'wsgi.multiprocess': False,
    'wsgi.multithread': True,
}

The __call__ method calls the get_response() method. Internally, it matches the route, executes the middleware and executes the view. Django wraps each middleware and view call in exception handling.
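
The middleware chain described above can be sketched with plain functions and dictionaries. This is an illustration of the pattern only: requests and responses are simple dicts, route matching is left out, and the names view, header_middleware, convert_exceptions and build_chain are assumptions made for this example.

```python
def view(request):
    """Stand-in view: fails for one path so the error handling is visible."""
    if request["path"] == "/boom/":
        raise ValueError("view failed")
    return {"status": 200, "body": "hello", "headers": {}}


def convert_exceptions(get_response):
    """Wrap a layer so an exception becomes a 500 response."""
    def inner(request):
        try:
            return get_response(request)
        except Exception as exc:
            return {"status": 500, "body": str(exc), "headers": {}}
    return inner


def header_middleware(get_response):
    """Example middleware: calls the next layer, then tags the response."""
    def middleware(request):
        response = get_response(request)
        response["headers"]["X-Seen"] = "1"
        return response
    return middleware


def build_chain(view_func, middleware_factories):
    """Wrap the view in each middleware, innermost last in the list."""
    handler = convert_exceptions(view_func)
    for factory in reversed(middleware_factories):
        handler = convert_exceptions(factory(handler))
    return handler


get_response = build_chain(view, [header_middleware])
ok = get_response({"path": "/test/"})
failed = get_response({"path": "/boom/"})
```

Because every layer is wrapped separately, a failure in the view still returns through the outer middleware as a normal response object.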

How does the Django ORM work?

The ORM has the following components:

  • Model
  • Manager
  • QuerySet
  • Query
  • SQLCompiler
  • DatabaseWrapper
  • Database Driver

Django Model

from django.db import models


class College(models.Model):
    name = models.CharField(max_length=100)
    address = models.CharField(max_length=400)


class Student(models.Model):
    name = models.CharField(max_length=200)
    enr_no = models.IntegerField()
    # on_delete is required since Django 2.0; CASCADE is assumed here.
    college = models.ForeignKey(College, on_delete=models.CASCADE)

  • If we import the models and print a model's attributes, we get the result below.

  • models.College.__dict__

mappingproxy({'__module__': 'debug.models',
              '__doc__': 'College(id, name, address)',
              '_meta': <Options for College>,
              'DoesNotExist': debug.models.College.DoesNotExist,
              'MultipleObjectsReturned': debug.models.College.MultipleObjectsReturned,
              'name': <django.db.models.query_utils.DeferredAttribute at 0x7fdad9b6ed30>,
              'address': <django.db.models.query_utils.DeferredAttribute at 0x7fdad9b6ed68>,
              'id': <django.db.models.query_utils.DeferredAttribute at 0x7fdad9b6ee80>,
              'objects': <django.db.models.manager.ManagerDescriptor at 0x7fdad9b6eef0>,
              'student_set': <django.db.models.fields.related_descriptors.ReverseManyToOneDescriptor at 0x7fdad9b78588>})
  • User models inherit from django.db.models.base.Model.
  • Model has the metaclass django.db.models.base.ModelBase, which prepares the model attributes.
  • add_to_class is the method that adds new attributes to the model class. It checks whether the value has a contribute_to_class method and, if so, calls it.
  • _meta is an instance of the django.db.models.options.Options class, which holds the model's metadata and utilities.
  • Model also has _state, which is an instance of ModelState.
class ModelState:
    db = None
    adding = True
    fields_cache = ModelStateFieldsCacheDescriptor()
  • ModelState records whether the instance is new (adding) and which database it came from (db). Django consults it, together with the primary key, to decide whether save() should issue an insert or an update.
  • Whenever we initialize a model (student_instance = Student(name="Hitul Mistry", enr_no="123456", college=college_instance)), adding is True. If we later call save() on student_instance, Django runs an insert query.
  • Now suppose we fetch an instance from the database and then modify it,
student_instance = models.Student.objects.filter().last()
student_instance.name = "Hitul Mistry"
student_instance.save()
  • In the above example, the instance was loaded from the database, so adding in _state is already False. On the save() call, Django runs an update query.
  • fields_cache caches related objects, such as a foreign key's target, once they are loaded, for example through select_related() or prefetch_related().
  • The from_db class method of django.db.models.base.Model gets called to build an instance from a database row. It initializes the model with the values received from the database.
  • Model implements __eq__, __str__, __repr__, __hash__, __getstate__ and __setstate__ to allow certain operations on model objects.
  • Django models also have fields (models.IntegerField, models.CharField etc.), which inherit from django.db.models.Field.
  • Each field has a contribute_to_class method, which the metaclass calls at model class creation time. Some methods that are specific to the field are added to the model. For example, get_next_by_<field_name> is added by DateTimeField.
  • get_internal_type returns the field's type name, which db_type looks up in the database backend's data_types mapping. The data_types mapping is found in the DatabaseWrapper. It is generally used by migrations to build the query. A custom field whose type is not in the mapping can override db_type.
data_types = {
    'AutoField': 'serial',
    'CharField': 'varchar(%(max_length)s)',
}
  • from_db_value() converts a value from the database type to a Python type (for example, converting a datetime to the configured time zone).
  • get_db_prep_save() is called before a value is saved to the database.
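
The metaclass, add_to_class and contribute_to_class mechanism described above can be sketched in plain Python. This is a deliberately small model of the idea, not Django's code: the Field, ModelBase and Model classes below are stand-ins, and _meta is a plain dictionary here rather than an Options instance.

```python
class Field:
    def contribute_to_class(self, cls, name):
        # The field registers itself on the class instead of being set directly.
        self.name = name
        cls._meta["fields"].append(self)


class ModelBase(type):
    def __new__(mcs, name, bases, attrs):
        # Separate attributes that know how to install themselves.
        contributable = {
            key: value for key, value in attrs.items()
            if hasattr(value, "contribute_to_class")
        }
        plain = {k: v for k, v in attrs.items() if k not in contributable}
        cls = super().__new__(mcs, name, bases, plain)
        cls._meta = {"fields": []}
        for attr_name, value in contributable.items():
            cls.add_to_class(attr_name, value)
        return cls

    def add_to_class(cls, name, value):
        # Let the value decide how it is attached, if it wants to.
        if hasattr(value, "contribute_to_class"):
            value.contribute_to_class(cls, name)
        else:
            setattr(cls, name, value)


class Model(metaclass=ModelBase):
    def __init__(self, **kwargs):
        for field in self._meta["fields"]:
            setattr(self, field.name, kwargs.get(field.name))


class College(Model):
    name = Field()
    address = Field()


college = College(name="ABC College", address="Campus road")
field_names = [field.name for field in College._meta["fields"]]
```

Routing every attribute through add_to_class is what lets a field add more than itself to the class, such as extra helper methods.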

Django Manager

debug_models.College.objects.create(
   name="ABC College",
   address="Abc college campus, rolland street road."
)
  • The from_queryset method dynamically builds a manager class that inherits from the manager.
  • At runtime, the manager copies the public QuerySet methods onto itself, skipping those marked queryset_only.
  • The create method calls the model's save method internally.
  • The create method returns the model object.
  • A model can have multiple managers; one of them serves as the default, available as _default_manager.
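
How a manager class can be built dynamically from a queryset class is sketched below. The classes are simplified stand-ins that work on a list of dictionaries, not Django's Manager and QuerySet, and the proxying logic is an assumption about the general technique rather than a copy of the real implementation.

```python
import inspect


class QuerySet:
    def __init__(self, rows):
        self.rows = rows

    def filter(self, **conditions):
        matches = [
            row for row in self.rows
            if all(row.get(key) == value for key, value in conditions.items())
        ]
        return QuerySet(matches)

    def count(self):
        return len(self.rows)

    def delete(self):
        self.rows.clear()
    delete.queryset_only = True  # must not be exposed on the manager

    def _clone(self):
        return QuerySet(list(self.rows))


class BaseManager:
    def __init__(self, rows):
        self._rows = rows

    def get_queryset(self):
        return QuerySet(self._rows)

    @classmethod
    def from_queryset(cls, queryset_class):
        def make_proxy(name):
            # Each proxy forwards the call to a fresh queryset.
            def proxy(self, *args, **kwargs):
                return getattr(self.get_queryset(), name)(*args, **kwargs)
            proxy.__name__ = name
            return proxy

        methods = {}
        for name, func in inspect.getmembers(queryset_class, inspect.isfunction):
            if name.startswith("_") or getattr(func, "queryset_only", False):
                continue
            methods[name] = make_proxy(name)
        class_name = f"{cls.__name__}From{queryset_class.__name__}"
        return type(class_name, (cls,), methods)


Manager = BaseManager.from_queryset(QuerySet)
objects = Manager([{"name": "ABC College"}, {"name": "XYZ College"}])
```

With this in place, objects.filter(name="ABC College").count() works on the manager, while objects.delete is not available and has to go through a queryset.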

Django Query

  • Holds the values for the compiler.
  • The classes in django.db.models.sql.subqueries inherit from django.db.models.sql.Query. They are the different query classes, such as InsertQuery, AggregateQuery, UpdateQuery etc.
  • Each query class has its own methods, such as insert_values, add_update_fields, add_related_update, add_filter etc.
  • Each query class names its own compiler in the compiler attribute. The compilers live in django.db.models.sql.compiler:
    • SQLInsertCompiler
    • SQLAggregateCompiler
    • SQLUpdateCompiler
    • SQLDeleteCompiler
    • SQLCompiler
  • The compiler's as_sql method prepares the query, which is later executed by the execute_sql method.
  • DatabaseWrapper executes the query and returns the values.
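
The division of labour between a query object and its compiler can be sketched against SQLite from the standard library. The Query and SQLCompiler classes below are simplified stand-ins that borrow Django's names for readability; the "college" table and its rows are made up for the demo, and a plain sqlite3 connection plays the role of the database wrapper.

```python
import sqlite3


class Query:
    """Holds what the caller asked for; knows nothing about SQL text."""

    def __init__(self, table):
        self.table = table
        self.filters = []  # list of (column, value) pairs

    def add_filter(self, column, value):
        self.filters.append((column, value))


class SQLCompiler:
    """Turns a Query into an SQL string plus parameters, then runs it."""

    def __init__(self, query, connection):
        self.query = query
        self.connection = connection

    def as_sql(self):
        # Identifiers are trusted in this sketch; values are always parameters.
        sql = f'SELECT * FROM "{self.query.table}"'
        params = []
        if self.query.filters:
            clauses = [f'"{column}" = ?' for column, _ in self.query.filters]
            sql += " WHERE " + " AND ".join(clauses)
            params = [value for _, value in self.query.filters]
        return sql, params

    def execute_sql(self):
        sql, params = self.as_sql()
        return self.connection.execute(sql, params).fetchall()


connection = sqlite3.connect(":memory:")
connection.execute('CREATE TABLE "college" (name TEXT, address TEXT)')
connection.execute('INSERT INTO "college" VALUES (?, ?)', ("ABC College", "Campus"))
connection.execute('INSERT INTO "college" VALUES (?, ?)', ("XYZ College", "Town"))

query = Query("college")
query.add_filter("name", "ABC College")
compiler = SQLCompiler(query, connection)
sql, params = compiler.as_sql()
rows = compiler.execute_sql()
```

Keeping the SQL text out of the query object is what allows one query description to be compiled differently for each database backend.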

Django Queryset

  • filter() returns a QuerySet object.
  • QuerySets are containers for objects.
  • QuerySets are lazy.
  • They implement __repr__, __iter__, __len__, __bool__, __getitem__ etc.
  • QuerySets have a result cache.
  • queryset.iterator(chunk_size=100) streams the results without filling the result cache, which helps with large result sets. Before Django 4.1, prefetch_related() was ignored when used with iterator().
  • The django.db.models.sql.Query class has methods such as add_filter, add_q, add_select_related, add_annotation, add_extra, add_ordering etc. to hold the data.
  • SQLCompiler forms the select query for the database and DatabaseWrapper executes it.
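
Laziness, chaining and the result cache can be illustrated together with a small stand-in class. LazyQuerySet, fake_database and the ROWS data are all hypothetical; the point is only that building the chain runs nothing, the first evaluation fetches once, and later evaluations reuse the cache.

```python
class LazyQuerySet:
    def __init__(self, fetch, filters=None):
        self._fetch = fetch            # callable that "hits the database"
        self._filters = dict(filters or {})
        self._result_cache = None      # filled on first evaluation

    def filter(self, **conditions):
        # Chaining returns a new object; nothing is fetched yet.
        return LazyQuerySet(self._fetch, {**self._filters, **conditions})

    def _fetch_all(self):
        if self._result_cache is None:
            self._result_cache = list(self._fetch(self._filters))

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def __bool__(self):
        self._fetch_all()
        return bool(self._result_cache)


ROWS = [
    {"name": "ABC College", "city": "Pune"},
    {"name": "ABC College", "city": "Surat"},
    {"name": "XYZ College", "city": "Pune"},
]
calls = []  # records every simulated database hit


def fake_database(filters):
    calls.append(dict(filters))
    return [
        row for row in ROWS
        if all(row[key] == value for key, value in filters.items())
    ]


queryset = LazyQuerySet(fake_database).filter(name="ABC College").filter(city="Pune")
calls_before = len(calls)      # building the chain ran no query
first_pass = list(queryset)    # evaluation happens here
second_pass = list(queryset)   # served from the cache
```

Each filter() call returns a new object, so an earlier queryset in the chain stays reusable and unevaluated.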

How does Django build a query in query chaining?

debug_models.College.objects.filter(
   name="ABC College"
).filter(address__contains="abc")
  • Each filter() returns a new QuerySet object that holds the accumulated query arguments.
  • The as_sql method in SQLCompiler forms the SQL query based on the parameters passed.
debug_models.College.objects.filter(
   name="ABC College"
).update(name="BBC College")

  • It simply runs an update query on the database and returns the number of records updated.
  • It does not call the save method, and the pre_save and post_save signals are not sent.
  • django.db.models.sql.subqueries.UpdateQuery has add_update_values, add_related_update etc., which store the data in the Query for the compiler.
  • SQLUpdateCompiler forms the query and DatabaseWrapper executes it on the database.
debug_models.College.objects.filter(
   name="ABC College"
).delete()
  • Django runs an SQL DELETE query on the database and also sends the pre_delete and post_delete signals.
  • It executes a raw query like DELETE FROM "debug_college" WHERE "debug_college"."id" IN (36, 35, 34, 33, 32, 31, 30);
  • The django.db.models.deletion.Collector class collects the objects to be deleted (collect()), deletes them (delete()) and sends the pre_delete and post_delete signals.
  • SQLDeleteCompiler forms the SQL query for the actual delete.
  • Related fields have an on_delete argument, which defines what happens to related objects on deletion. Supported options include CASCADE, PROTECT, RESTRICT, SET_NULL, SET_DEFAULT, SET() and DO_NOTHING.
  • Raw queries can be used where the SQL generated by the ORM is not efficient enough.
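
The collect, signal and delete sequence can be sketched with a dictionary standing in for a table. Signal, Collector, pre_delete and post_delete below are simplified stand-ins that reuse Django's names for readability; cascading to related objects is left out, and the exact ordering of signals in Django may differ from this sketch.

```python
class Signal:
    """Tiny observer list standing in for a signal."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send(self, instance):
        for receiver in self.receivers:
            receiver(instance)


pre_delete = Signal()
post_delete = Signal()


class Collector:
    def __init__(self, table):
        self.table = table      # dict mapping primary key -> row
        self.collected = []

    def collect(self, pks):
        # Remember only the rows that actually exist.
        self.collected.extend(pk for pk in pks if pk in self.table)

    def delete(self):
        for pk in self.collected:
            pre_delete.send(self.table[pk])
        deleted = [self.table.pop(pk) for pk in self.collected]
        for row in deleted:
            post_delete.send(row)
        return len(deleted)


table = {1: {"id": 1}, 2: {"id": 2}, 3: {"id": 3}}
events = []
pre_delete.connect(lambda row: events.append(("pre", row["id"])))
post_delete.connect(lambda row: events.append(("post", row["id"])))

collector = Collector(table)
collector.collect([1, 3, 99])
deleted_count = collector.delete()
```

Collecting first is what makes the signals possible: the objects have to be known in Python before the DELETE runs, which is also why this path is slower than a bare SQL statement.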

Django Database Wrapper

django.db.backends.<database>.base.py

  • It provides the methods for creating connections and cursors.
  • It contains the Django model field type to database type mappings.
  • It also contains database-level mappings for operators (exact, iexact, regex etc.) and pattern lookups (contains, icontains etc.).
operators = {
    'exact': '= %s',
    'iexact': '= UPPER(%s)',
    'contains': 'LIKE %s',
}

pattern_ops = {
    'contains': "LIKE '%%' || {} || '%%'",
    'icontains': "LIKE '%%' || UPPER({}) || '%%'",
    'startswith': "LIKE {} || '%%'",
}

data_types = {
    'AutoField': 'serial',
    'BigAutoField': 'bigserial',
    'BinaryField': 'bytea',
    ...
}
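
How such mappings are consumed can be shown with two small helper functions. The helpers where_clause and column_type are hypothetical names invented for this sketch, and the handling is much simpler than a real backend: for instance, wildcard characters inside the value are not escaped here.

```python
operators = {
    "exact": "= %s",
    "iexact": "= UPPER(%s)",
    "contains": "LIKE %s",
}

data_types = {
    "AutoField": "serial",
    "CharField": "varchar(%(max_length)s)",
}


def where_clause(column, lookup, value):
    """Build an SQL fragment and its parameter list for one lookup."""
    right_side = operators[lookup]       # e.g. "LIKE %s"
    left_side = f'"{column}"'
    if lookup == "iexact":
        left_side = f"UPPER({left_side})"
    if lookup == "contains":
        value = f"%{value}%"             # wildcards go into the parameter
    return f"{left_side} {right_side}", [value]


def column_type(field_type, **field_options):
    """Fill the column type template with the field's options."""
    return data_types[field_type] % field_options


contains_sql = where_clause("address", "contains", "abc")
iexact_sql = where_clause("name", "iexact", "abc college")
char_column = column_type("CharField", max_length=100)
auto_column = column_type("AutoField")
```

The %s placeholder stays in the SQL text and the value travels separately, so the database driver does the quoting.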

DatabaseFeatures

django.db.backends.<database>.features.py

  • It contains certain attributes that help Django form a query, or raise an exception, depending on what the database supports. Example features are,
class DatabaseFeatures(BaseDatabaseFeatures):
    allows_group_by_selected_pks = True
    can_return_columns_from_insert = True
    can_return_rows_from_bulk_insert = True
    has_real_datatype = True

DatabaseOperations

django.db.backends.<database>.operations.py

  • It contains the common SQL operations that can be leveraged by Django models, fields and compilers.
  • Examples are set_time_zone_sql, datetime_cast_date_sql, distinct_sql etc.
