Friday, July 22, 2011

The Design

This week, Michael came up with the idea of writing a blog about my work. Please start reading it from the beginning. The blog should document the progress of my work. It is also meant to simplify communication between new contributors, interested experts and me.

We put a lot of effort into improving the design and made great progress. Currently, the design is the following (I hope we can keep it).

citable.py
Contains all the metaclass magic: CitationBase provides some basic functions; LoadableCitation loads the bib-file using pybtex as soon as the citation is needed. It then changes the class (!) of the element to Citation, which can print the citation using pybtex. We get a cache for free, because the bib-file is loaded only once.
The class Citable: all classes representing citable elements in sage (citables.py) inherit from this class. When they are created, they all have the (meta-)class LoadableCitation. They can specify a type (default: software) and the filename of the bib-file (if it differs from the class' name).
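
To make the class-swapping idea more concrete, here is a minimal sketch of how it might look. It is not the project's actual code: the class names are taken from the description above, but every method body, the `filename` attribute lookup and the `fake_load` helper (a stand-in for the real pybtex parsing) are assumptions made for illustration. It is written for Python 3, so it uses the `metaclass=` keyword.

```python
# Sketch only: the names mirror the post, the bodies are assumptions.

def fake_load(filename):
    """Hypothetical stand-in for parsing a bib-file with pybtex."""
    return "@misc{%s}" % filename


class CitationBase(type):
    """Helpers shared by both metaclasses."""

    def bib_filename(cls):
        # Default to the class name unless the class sets `filename`.
        return getattr(cls, "filename", cls.__name__ + ".bib")


class Citation(CitationBase):
    """Metaclass of citables whose bib entry is already loaded."""

    def __str__(cls):
        return cls._entry


class LoadableCitation(CitationBase):
    """Metaclass of citables whose bib entry has not been loaded yet."""

    load_count = 0  # only here to make the caching visible

    def __str__(cls):
        LoadableCitation.load_count += 1
        cls._entry = fake_load(cls.bib_filename())
        # Swap the metaclass: from now on Citation.__str__ is used.
        cls.__class__ = Citation
        return str(cls)


class Citable(metaclass=LoadableCitation):
    pass


class singular(Citable):
    pass


print(str(singular))   # first use: loads, then swaps the metaclass
print(str(singular))   # second use: served by Citation, no reload
print(type(singular).__name__)
```

Because the loading code lives only in LoadableCitation, a citable that has already been swapped to Citation can never trigger a second load. That is the sense in which the cache comes "for free".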

citables.py
from citable import Citable

class singular(Citable):
    pass

class magma(Citable):
    pass

class slimgb(singular):
    pass
Note that the code only holds the dependencies between citations; the rest is done automatically. (There are more examples below.)
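
One way such dependencies could be resolved automatically is by walking the inheritance chain. The following sketch assumes that approach; the helper `citation_chain` is a hypothetical name and the real implementation may work differently. It is written for Python 3.

```python
class Citable(object):
    """Base class of everything that can be cited."""


class singular(Citable):
    pass


class slimgb(singular):
    pass


def citation_chain(citable):
    """Return the citable and its dependencies, base classes first.

    Hypothetical helper: it walks the method resolution order and
    keeps every proper subclass of Citable.
    """
    return [
        cls
        for cls in reversed(citable.__mro__)
        if issubclass(cls, Citable) and cls is not Citable
    ]


print([cls.__name__ for cls in citation_chain(slimgb)])
```

With this ordering, citing slimgb lists singular first and slimgb second.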

citation.pyx
A cython file containing the "ugly stuff". You can decorate your own function with @use_citation(citable). When the function is called, the citation is added to the set of used citations. You can also add to this set manually by calling used_citations(citable). To get all the used citations, simply call print used_citations, and it will give you everything you need (for this sage session).
You can also use any format supported by pybtex (bibtex, bibtexml, bibyaml, latex, html, plaintext) by calling used_citations.print_all(format).

>>> @use_citation(citables.magma)
... def simple_func():
...     pass
>>> print used_citations
None
>>> simple_func()
>>> print used_citations
@article{
    MR1484478,
    author = "Bosma, Wieb and Cannon, John and Playoust, Catherine",
    volume = "24",
    doi = "10.1006/jsco.1996.0125",
    title = "The {M}agma algebra system. {I}. {T}he user language",
    url = "http://dx.doi.org/10.1006/jsco.1996.0125",
    journal = "J. Symbolic Comput.",
    issn = "0747-7171",
    mrclass = "68Q40",
    number = "3-4",
    note = "Computational algebra and number theory (London, 1993)",
    mrnumber = "MR1484478",
    year = "1997",
    pages = "235--265",
    fjournal = "Journal of Symbolic Computation"
}
>>> print used_citations.print_all("latex")
\begin{thebibliography}{8}

\bibitem[1]{MR1484478}
Wieb Bosma, John Cannon, and Catherine Playoust.
\newblock The {M}agma algebra system. {I}. {T}he user language.
\newblock \emph{J. Symbolic Comput.}, 24:235--265, 1997.

\end{thebibliography}

Dependencies of citations are also possible.
>>> used_citations.forget()
>>> used_citations(citables.slimgb)
>>> used_citations.format = "latex"
>>> print used_citations
\begin{thebibliography}{8}

\bibitem[1]{DGPS}
W.~Decker, G.-M. Greuel, G.~Pfister, and H.~Sch\"{o}nemann.
\newblock {\sc Singular} {3-1-3} --- {A} computer algebra system for polynomial computations.
\newblock 2011, http://www.singular.uni-kl.de.

\bibitem[2]{slimgbrevista}
Michael Brickenstein.
\newblock Slimgb: {G}r{\"o}bner bases with slim polynomials.
\newblock \emph{Revista Matemática Complutense}, 23, issue 2:453--466, 2010.

\end{thebibliography}
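
The interplay between the decorator and the set of used citations, as seen in the sessions above, can be sketched in plain Python. This is an illustration under assumptions, not the cython code from citation.pyx: only the names `use_citation`, `used_citations` and `forget` come from the post, while the `UsedCitations` class and its internals are made up. Formatting through pybtex is left out. The sketch is written for Python 3.

```python
import functools


class UsedCitations(object):
    """Hypothetical registry of every citable used in this session."""

    def __init__(self):
        self._used = set()

    def __call__(self, citable):
        # used_citations(citable) adds a citation manually.
        self._used.add(citable)

    def __contains__(self, citable):
        return citable in self._used

    def __len__(self):
        return len(self._used)

    def forget(self):
        self._used.clear()


used_citations = UsedCitations()


def use_citation(citable):
    """Decorator factory: record `citable` whenever the function runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            used_citations(citable)
            return func(*args, **kwargs)
        return wrapper
    return decorator


class magma(object):
    """Placeholder for citables.magma."""


@use_citation(magma)
def simple_func():
    pass


print(len(used_citations))  # nothing recorded before the first call
simple_func()
print(magma in used_citations)
```

Note that the citation is recorded when the function is called, not when it is decorated, which matches the session above where used_citations is still empty before simple_func() runs.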

There is a logging feature, which I'm not happy about yet. Furthermore, the documentation needs huge improvements.

It is very interesting to see how well the low-level stuff, like cython optimization and calling directly into Sets and Tuples, works together with high-level magic like metaclasses and changing a class' (meta-)class.

The thing about performance

My third week was more relaxed. My boss was on a business trip and I was alone in the office all day.
Well, I learned cython and made some attempts at integrating pybtex into my sage-citations. Furthermore, I learned about sage's SPKGs and made my first contribution to sage. In addition, I cut the metaclasses, because I couldn't make them work with cython. (They'll come back, don't worry.)

But I have to write about our performance problems. They are actually the reason why Michael made me take a look at cython.
It would be very nice if the citation implementation were spread widely throughout sage. You could just use sage as usual and then say "give me everything", and it would give you back all the necessary citations. But to achieve this, we need many sage functions to be decorated like "this function uses software_z and refers to paper_xy". This makes every call of such a function slower. If the loss of speed were noticeable, no one would use the decorator.
You can easily see the focus on performance in the code. Some things are actually pretty ugly (I'm always open to suggestions for improvements).
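
The overhead in question can be measured with the standard timeit module. The sketch below is not Michael's performance test; it is a plain-Python illustration in which the decorator body, the string "magma" used as a citable and the helper `per_call_seconds` are all assumptions. It is written for Python 3, and the absolute numbers depend entirely on the machine.

```python
import timeit

used = set()


def use_citation(citable):
    """Simplified pure-Python decorator that records a citation."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            used.add(citable)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def plain():
    return 1


@use_citation("magma")
def decorated():
    return 1


def per_call_seconds(func, number=20000):
    """Best of three runs, divided by the number of calls per run."""
    return min(timeit.repeat(func, number=number, repeat=3)) / number


plain_time = per_call_seconds(plain)
decorated_time = per_call_seconds(decorated)
print("plain:     %.3e s per call" % plain_time)
print("decorated: %.3e s per call" % decorated_time)
```

For a function that does real work the extra call and set insertion hardly matter, but for tiny functions that are called millions of times the relative cost is what counts.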

If you don't like talking to me, talk to Burcin Erocal.

Starting over and over ...

You can always take a look at the code (although it might not work from time to time, due to ongoing development). You might also refer to the repository as my micro-blog.

Please write me a message on BitBucket or comment on one of my blog posts if you want to get in contact with me.

My second week was a week of trying and re-trying. I made several attempts at a viable code design. It took many rewrites and we did a lot of experiments. There was a function-only design, a mixed design with classes and functions, and one with no code outside the classes. And finally, metaclasses. I learned metaclasses in my second week of python. (In short: a class can have a class and therefore be an instance of that class.) Michael wrote a performance test for me, so that we can optimize the performance more efficiently.
I will explain the performance problem later, after I have explained the (planned) design of the whole citation-implementation.
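
The remark about metaclasses, that a class is itself an instance of another class, can be checked directly in the interpreter. The names `Meta`, `Plain` and `Special` in this small sketch are made up for illustration, and it uses the Python 3 `metaclass=` syntax.

```python
class Meta(type):
    """A metaclass: its instances are classes."""

    def describe(cls):
        return "%s is an instance of %s" % (cls.__name__, type(cls).__name__)


class Plain(object):
    pass


class Special(metaclass=Meta):
    pass


print(type(Plain).__name__)    # ordinary classes are instances of type
print(Special.describe())      # methods of Meta are available on the class
print(isinstance(Special, Meta))
```

Methods defined on the metaclass are callable on the class itself but not on its instances, which is what makes metaclasses a natural home for per-class behaviour.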

During my second week, I wrote a lot of documentation (+ doctests). I knew that it would not survive the following days, but I also knew that I would have to be able to write documentation about my code at any time.
Actually, it was fun too.

Hello World!

This is the first entry in my first blog.
Nota Bene: If you don't know what python is (and I don't mean the snake), you should probably leave now. Otherwise, this Blog could be fun for you :-).

But let's start from the beginning.
I started working at the mathematical research institute of Oberwolfach on the 27th of June. I was advised to read chapters one through six of DiveIntoPython before starting to work here.
But learning python wasn't over when I came here. The other guy in my office (actually it is his office and he is my boss) taught me closures, nested functions and decorators.

Here comes an easy example of what a decorator is (in case you don't already know):
def my_function():
    print "that"

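def my_decorator(f):
    # Return a new function that wraps f, instead of calling f right away.
    def wrapper():
        print "this"
        f()
        print "the other"
    return wrapper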

my_function = my_decorator(my_function)

my_function()
this
that
the other

But python has a shortcut for decorators:
@my_decorator
def moo():
    print "moo"

moo()
this
moo
the other
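
The examples above decorate functions that take no arguments. A decorator for general use usually forwards arguments and return values and keeps the name of the wrapped function. The sketch below shows one common way to do that with functools.wraps; the names `logged`, `calls` and `add` are made up for illustration, and it is written for Python 3.

```python
import functools

calls = []  # collects messages instead of printing them


def logged(func):
    """Decorator that records when the wrapped function starts and ends."""
    @functools.wraps(func)  # keep func's name and docstring on the wrapper
    def wrapper(*args, **kwargs):
        calls.append("before " + func.__name__)
        result = func(*args, **kwargs)
        calls.append("after " + func.__name__)
        return result
    return wrapper


@logged
def add(a, b):
    """Return a + b."""
    return a + b


print(add(2, 3))
print(calls)
print(add.__name__)
```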

I had never worked with python before. As you can imagine, it was brain-twisting (but even more fun). python's feature that everything is a first-class object is really great, and it makes my work easier and more demanding at the same time.

My boss, Michael Brickenstein, keeps saying that he feels guilty for torturing me with nested functions in my first days of python. Therefore, he sometimes brings home-baked cookies to the office.

From Tuesday onwards, a collaborator on this project came from Berlin to visit Michael. The three of us had a lot of fun discussing and coding together. I was able to take a look at his work (mainly sql). Therefore, I didn't do much python programming for the rest of the week.
But I found out what my position within this project will be. The project, by the way, is called S-MATH and has to do with citing mathematical software. The project has just started and will probably take a long time.
As far as I understand, I should be the one who gets in touch with the sage community and tries to build an implementation for citing mathematical software into sage.

If you are involved in the sage community, this blog is exactly right for you; we hope for rich communication.
I hope that the sage community will accept me, and I will always be open to suggestions and comments on my work.