Informatics

Ever had one of these issues with PyCharm 2018 and Docker?

Couldn't refresh skeletons for remote interpreter
The docker-compose process terminated unexpectedly: /usr/local/bin/docker-compose -f docker-compose.yml -f .PyCharm2018.3/system/tmp/docker-compose.override.8.yml run --rm --name skeleton_generator_643129755 python
Regenerate skeletons

or

can't open file '/opt/.pycharm_helpers/pycharm/django_test_manage.py' + "No such file or directory"

Then you should clear all PyCharm helpers from your Docker containers and images:

docker ps -a | grep -i pycharm | awk '{print $1}' | xargs docker rm
docker images | grep -i pycharm | awk '{print $3}' | xargs docker rmi

Sometimes I just love how the open source community works nowadays! Sometimes you use your favourite search engine to find a package/repository/etc..., just to find out that it contains quickstart info, a proper package.json or a docker-compose.yml. All you need to do is "npm install && npm run" or "docker-compose build && docker-compose up" and you are good to go.

This helps devs to contribute to open source software soooooo much!

TL;DR: Use Docker, package.json, requirements.txt, and for the love of god, write a Quickstart section into your Readme!

If you have ever gotten this error with your mod_passenger installation:

$ passenger-config restart-app
*** ERROR: Phusion Passenger doesn't seem to be running. If you are sure that it
is running, then the causes of this problem could be one of:

1. You customized the instance registry directory using Apache's
 PassengerInstanceRegistryDir option, Nginx's
 passenger_instance_registry_dir option, or Phusion Passenger Standalone's
 --instance-registry-dir command line argument. If so, please set the
 environment variable PASSENGER_INSTANCE_REGISTRY_DIR to that directory
 and run this command again.
 2. The instance directory has been removed by an operating system background
 service. Please set a different instance registry directory using Apache's
 PassengerInstanceRegistryDir option, Nginx's passenger_instance_registry_dir
 option, or Phusion Passenger Standalone's --instance-registry-dir command
 line argument.

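Then your web server is most likely started by systemd with PrivateTmp enabled, which gives the service its own private /tmp directory instead of the shared /tmp. passenger-config, run from your shell, then looks for the instance registry in the wrong place.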

It's annoying, but easy to fix, see this blog post:
https://www.pistolfly.com/weblog/en/2016/01/passenger-config-and-passenger-status-result-in-an-error-on-centos7.html

Inspired by a recent commitstrip: http://www.commitstrip.com/en/2017/05/06/bugs-of-the-future/

Like almost everyone else I had to test my Single Page AngularJS Application, which uses Django Rest Framework as a Backend, with Internet Explorer (11, thankfully). While my SPA works fine in Chrome and Firefox, it does not work very well in Internet Explorer (shocking, lmao).

Anyway, the obvious errors, ranging from missing polyfills to some CSS quirks, were fixed quickly.

But at some point I noticed: Why the hell are changes I make via PUT/POST/PATCH not shown when I make a GET request (retrieving all instances of a model) to the same endpoint afterwards? It kept returning the same data again and again. Where the hell did my changes go? Is my database broken? Is Internet Explorer not firing the PUT/POST/PATCH calls properly?

None of that was true. As it turns out, Internet Explorer 11 caches almost all GET requests to my REST API. I submit a change, I reload the page, and the change has disappeared. Caching at its best.

So I googled and stackoverflowed (is that a word?), and I found some people talking about cache headers. And they are god damn right... Django as well as Django Rest Framework do not set any cache headers in the response by default (most likely for a good reason: better explicit than implicit).

So I tried a couple of the provided solutions, and I must say, I was really unhappy with those approaches. They were either very repetitive (like the @never_cache decorator added to all my viewsets), or required monkey patches and other sorts of things that I do not like to see. Btw, a cache buster within my JavaScript SPA was a no-go for me.

So I thought to myself: How about I make my whole Django Rest Framework Application "uncachable"? And so I did, with very few lines of code, as a Django Middleware:

from django.utils.cache import add_never_cache_headers

class DisableClientSideCachingMiddleware(object):
    """
    Internet Explorer / Edge tend to cache REST API calls, unless some specific HTTP headers are added by our
    application. add_never_cache_headers sets the following Cache-Control directives (among others):

    - no-cache
    - no-store
    - must-revalidate
    """
    def process_response(self, request, response):
        add_never_cache_headers(response)
        return response

Don't forget to also add this middleware to your settings.

MIDDLEWARE_CLASSES = (
    ...
    'yourapp.middlewares.DisableClientSideCachingMiddleware',
)

Please note that this disables caching of ALL requests coming to your Django Application. If you are serving static files from your Django Application (instead of serving them directly from your webserver), this will affect your performance.

This middleware works for me. I later discovered that somebody else has already had a similar idea, too. The obvious disadvantage here is that it disables caching for all parts of my Django Application. This is fine for me, as I am only using Django Rest Framework and I do not want caching on client side to happen at all, but it might not be okay for some other applications.
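If disabling caching for the whole application is too broad for you, the same idea can be limited to the API. The following is only a minimal sketch under a few assumptions, not the code I am running: the class name and the /api/ prefix are hypothetical, it uses the newer callable middleware style (the one you list under MIDDLEWARE in Django 1.10 and later), and it writes the Cache-Control header by hand instead of calling add_never_cache_headers, so the snippet has no imports.

```python
class DisableApiCachingMiddleware:
    """New-style middleware: only responses for API paths are marked uncacheable."""

    # Hypothetical URL prefix of the REST API; adjust it to your project.
    api_prefix = "/api/"

    def __init__(self, get_response):
        # Django passes in the next handler of the chain when it builds the stack.
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        if request.path.startswith(self.api_prefix):
            # Hand-written approximation of what add_never_cache_headers sets.
            response["Cache-Control"] = "max-age=0, no-cache, no-store, must-revalidate"
        return response
```

Static files and regular pages keep whatever caching behaviour they had before; only requests whose path starts with the prefix get the header.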

Nevertheless, I hope this piece of code helps people that have similar problems. Also, please feel free to share your experiences and solutions to such caching problems with Internet Explorer.

Django has an interesting default behaviour for NullBooleanFields, which are used by django_filters' BooleanFilter. While the string 'True' evaluates to the Python boolean True, and the string 'False' evaluates to the Python boolean False, this does not happen for the lowercase variants 'true' and 'false'. This is kind of annoying when you are using Django Rest Filters, where you would have a REST API call like this (e.g., when calling from JavaScript):

GET /tasks/?show_only_my_tasks=true

This does not work as expected, as "show_only_my_tasks=true" evaluates to "None".

The correct usage according to Django's NullBooleanField would have been this:

GET /tasks/?show_only_my_tasks=True
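To make the behaviour easier to see, here is a small stand-alone sketch of the lookup that happens to the raw query parameter. It needs no Django at all; the names DEFAULT_LOOKUP and parse_null_boolean are made up for this example, and the table reflects the Django versions I was working with (newer releases may behave differently).

```python
# Lookup table modelled on NullBooleanSelect in the Django versions I used
# (an assumption; check your own Django version).
DEFAULT_LOOKUP = {
    "2": True, True: True, "True": True,
    "3": False, "False": False, False: False,
}


def parse_null_boolean(value, lookup=DEFAULT_LOOKUP):
    """Return True, False or None (unknown) for a raw query parameter value."""
    return lookup.get(value)


print(parse_null_boolean("True"))  # True
print(parse_null_boolean("true"))  # None
```

Anything that is not in the table silently becomes None, which is why the lowercase variant looks like the filter was never set.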

To overcome this issue, you can use the following code snippet:

import django_filters
from django import forms
from django.forms.widgets import NullBooleanSelect


class BetterBooleanSelect(NullBooleanSelect):
    """
    Django's NullBooleanSelect does not evaluate 'true' to True or 'false' to False.
    This overridden NullBooleanSelect allows that.
    See https://code.djangoproject.com/ticket/22406#comment:3
    """
    def value_from_datadict(self, data, files, name):
        value = data.get(name)
        return {
            '2': True,
            True: True,
            'true': True,  # added, as NullBooleanSelect does not do that
            'True': True,
            '3': False,
            'false': False,  # added, as NullBooleanSelect does not do that
            'False': False,
            False: False,
        }.get(value)


class BetterBooleanField(forms.NullBooleanField):
    """
    Better Boolean Field that also evaluates 'false' to False and 'true' to True
    """
    widget = BetterBooleanSelect


class BetterBooleanFilter(django_filters.BooleanFilter):
    """
    This boolean filter allows evaluating 'true' and 'false'
    """
    field_class = BetterBooleanField

In your REST Filter you then only need to write this:

class TaskFilter(BaseFilter):
    """ Filter for Tasks """
    class Meta:
        model = Task

    show_only_my_tasks = BetterBooleanFilter()

I have created a special page with a Monero JavaScript Miner using CoinHive.

You can start the miner if you visit this page (and this page only): https://chkr.at/miner/

Note: Monero is something like Bitcoin, except that it can be mined in a browser. I am using this as a way to allow people to say thank you.

If you want to learn more about mining Monero with coin-hive, I would like to direct you to this YouTube Video (not made by me):

 

ForeignKeys need to have the on_delete Attribute set (e.g., to models.CASCADE for a cascading delete)

This also affects existing migrations. If you have migrations that you created with Django 1.8, you will run into errors (as they do not have that attribute set in the migration).

Also see this Ticket: https://code.djangoproject.com/ticket/28677

Apps should specify the "app_name" attribute in their urls.py

If you don't do this, you might run into an error when you include that urls.py in another urls.py

Also see this Ticket: https://code.djangoproject.com/ticket/28691

SessionAuthenticationMiddleware is no longer available

If you have SessionAuthenticationMiddleware listed in MIDDLEWARE (you most likely do if you are upgrading from an older Django version), you will have to remove it from your middleware list (or tuple).

user.is_authenticated() and user.is_anonymous() are no longer available as methods

They are now properties and have to be accessed without the function parentheses!
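A tiny illustration of the difference, using a toy class instead of Django's real user model (the User class below is purely hypothetical and only mimics the property):

```python
class User:
    """Toy stand-in for Django's user model, for illustration only."""

    @property
    def is_authenticated(self):
        return True


user = User()

# New style: read the attribute, no parentheses.
if user.is_authenticated:
    print("logged in")

# Old style: the property already returned a bool, and calling a bool fails.
try:
    user.is_authenticated()
except TypeError as exc:
    print("old call style fails:", exc)
```

So when upgrading, search your code base and templates for is_authenticated() and is_anonymous() and drop the parentheses.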


Ever wondered why a certain Python package was installed?

E.g., when you are installing WeasyPrint you will find that it installs a lot of other libraries, such as cffi, cairocffi and html5lib. With pipdeptree you can visualize this 🙂

pip install pipdeptree

pipdeptree

WeasyPrint==0.40
  - cairocffi [required: >=0.5, installed: 0.8.0]
    - cffi [required: >=1.1.0, installed: 1.11.0]
      - pycparser [required: Any, installed: 2.18]
  - CairoSVG [required: >=1.0.20, installed: 2.0.3]
    - cairocffi [required: Any, installed: 0.8.0]
      - cffi [required: >=1.1.0, installed: 1.11.0]
        - pycparser [required: Any, installed: 2.18]
    - cssselect [required: Any, installed: 1.0.1]
    - lxml [required: Any, installed: 3.8.0]
    - pillow [required: Any, installed: 4.2.1]
      - olefile [required: Any, installed: 0.44]
    - tinycss [required: Any, installed: 0.4]
  - cffi [required: >=0.6, installed: 1.11.0]
    - pycparser [required: Any, installed: 2.18]
  - cssselect2 [required: >=0.1, installed: 0.2.0]
    - tinycss2 [required: Any, installed: 0.6.0]
      - webencodings [required: >=0.4, installed: 0.5.1]
  - html5lib [required: >=0.999999999, installed: 0.999999999]
    - setuptools [required: >=18.5, installed: 36.5.0]
    - six [required: Any, installed: 1.11.0]
    - webencodings [required: Any, installed: 0.5.1]
  - Pyphen [required: >=0.8, installed: 0.9.4]
  - tinycss2 [required: >=0.5, installed: 0.6.0]
    - webencodings [required: >=0.4, installed: 0.5.1]

Remember those days when you just did something like

pip install numpy
pip install matplotlib

and wrote python code in some file called calculate_and_plot.py and your (data science) project just got some nice plots?

This was probably before you ever heard about Python virtual environments. And even if you did hear about them, you probably said to yourself: Why would I add another layer of complexity? I don't need that for now, it's just a little project.

"I can handle my python libraries just fine without introducing more complexity!"

Well, let me tell you this: You are both right and wrong. If your goal is just doing a little project that you will use once and then forget about it, then you really don't need a virtual environment. However, this does not mean that you shouldn't use it! You will end up having to re-visit your code at some point in time, and then you are going to ask yourself the following two questions:

  • What was this library called I used to do XYZ? (you probably wrote that down in a README anyway, right?)
  • What version of said library did I use? Was it 3.1? 5.7? 1.0? 0.9rc1? Oh my god there are so many different versions!?!

Both questions are only symptoms of a problem with how Python libraries are usually managed. By default, on most operating systems (Windows as well as Linux), pip will install your Python libraries (such as numpy, matplotlib, Django, ...) into the system-wide site-packages directory of your OS Python installation (that's also why you are usually required to do this with Admin rights or sudo).

"But virtual environments are so complex, and I really need to finish this project on time, so ..."

Let me give you a quick introduction and you will see that they are not complex at all. Also, about the time component: Not using a virtual environment could be one of these things that you might regret later (e.g., when you give your Python code to a colleague).

What are Python Virtual Environments?

Actually, the name is kind of misleading. "Virtual" usually implies that there is some kind of virtualization going on. This is not the case. It's really just a set of symbolic links (e.g., for the python binary) and directories that contain your python libraries.

What activating it really does is modify your local environment variables (most importantly PATH), which tells the shell where to find the python interpreter and python libraries.
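You can see this for yourself from within Python. The following is a small sketch (the function name is my own) that checks whether the running interpreter belongs to a virtual environment and prints the values that activation influences:

```python
import os
import sys


def in_virtual_environment():
    """True if the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases leave sys.base_prefix pointing at the
    # real installation; older virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


print("inside a venv:", in_virtual_environment())
print("interpreter:", sys.executable)
print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV"))
```

Run it once with your system python and once with an activated venv and compare the output.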

How do I create a Python Virtual Environment?

IMHO the best and simplest way to create and manage your Virtual Environments, or "venvs", is to do it in your local project folder. Assuming you have the following project:

  • research_paper_876/
    • statistics.py
    • data/
      • run1.csv
      • run2.csv
      • run3.csv
    • plots/
      • run1.png
      • run2.png
      • run3.png

Then you would create your virtual environment within the folder research_paper_876 like this:

cd research_paper_876
virtualenv -p python3 venv

This will create a folder called venv in your research_paper_876 directory. Note: If you are using git, svn or any other version control system, I recommend adding an ignore rule for the venv directory (e.g., in your .gitignore). DO NOT ADD THE VENV DIRECTORY TO YOUR VERSION CONTROL SYSTEM!
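If you do not have the virtualenv tool installed, Python 3 ships a venv module that does much the same job (on the shell: python3 -m venv venv). Here is a minimal sketch that creates the environment from a Python script instead; the helper name create_project_venv is my own, and I pass with_pip=False only to keep the example fast, which means the resulting environment has no pip until you create it with with_pip=True.

```python
import venv
from pathlib import Path


def create_project_venv(project_dir, name="venv"):
    """Create <project_dir>/<name> using the interpreter that runs this script."""
    target = Path(project_dir) / name
    # with_pip=False keeps the sketch fast; use with_pip=True to bootstrap pip as well.
    venv.create(target, with_pip=False)
    return target
```

Calling create_project_venv("research_paper_876") would create the venv folder inside the project directory, just like the virtualenv command above.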

Your directory structure will now look like this:

research_paper_876/

  • statistics.py
  • data/
    • run1.csv
    • run2.csv
    • run3.csv
  • plots/
    • run1.png
    • run2.png
    • run3.png
  • venv/
    • bin/
      • python (symbolic link to your python installation)
      • pip (symbolic link to pip)
      • ...
    • include/
    • lib/
      • python3.*/
        • site-packages/
          • ...
        • ...

Okay, next step: Activate your venv!

This is done with the following shell command:

source venv/bin/activate

Often you will find that your shell shows you that you have activated a virtual environment by adding a prefix, e.g.:

ckreuzberger@localhost:~/research_paper_876$ source venv/bin/activate
(venv)ckreuzberger@localhost:~/research_paper_876$

Now that you have activated your venv, you can install the desired libraries (e.g., numpy and matplotlib).

pip install numpy matplotlib

This will install these libraries and all required dependencies into your venv/lib/python3.*/site-packages/ folder.

If you now run your python code (e.g., python statistics.py) within your venv, only the libraries installed in your venv will be used.

Two more things you need to know:

First: Create a file called requirements.txt in your project's main directory by using the following command:

pip freeze > requirements.txt

This will fill your requirements.txt with a set of libraries and versions. When I wrote this tutorial it looked like this:

numpy==1.13.1
matplotlib==2.0.2
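Since the pinned versions are the whole point of this file, it can be handy to check later whether your current environment still matches them. The following is a rough sketch, not a replacement for pip: parse_pins and check_pins are names I made up, and the parser only understands plain name==version lines as written by pip freeze (no extras, markers or URLs).

```python
from importlib import metadata


def parse_pins(text):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {name: (pinned, installed_or_None)} for every pin that does not match."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


example = "numpy==1.13.1\nmatplotlib==2.0.2\n"
print(parse_pins(example))
```

An empty result from check_pins means every pinned package is installed in exactly the pinned version.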

Second: When you have finished working on your project, you should deactivate your venv by running the following command:

deactivate

How to re-create the same environment later

If you give your project to a colleague, or publish it on GitHub, etc..., you would supply your code and the requirements.txt file. Your colleague can then create the exact same Python virtual environment by executing the following commands:

virtualenv -p python3 venv
source venv/bin/activate
pip install -r requirements.txt

This is something you could (and should) write into a README file, so you and potentially others don't forget about it later.

 

Where can I read more about this?

I recommend reading the official docs on python.org about virtual environments: https://docs.python.org/3/tutorial/venv.html