Friday, September 30, 2011

Sage Days 34

I'm currently at Sage/Singular Days in Kaiserslautern.

The people are very nice and we are coding a lot. I even gave a talk.

Friday, August 26, 2011

Awful Benchmarks

come deep enough for good alphabetics.

I don't like benchmarking.
The effect I want to measure is so subtle that almost any benchmark adds far more overhead than the thing being measured.

And decorating a pass-function is somehow unsatisfying. The cheaper the computation, the higher the relative overhead of the decorator. Therefore, a pass-function is probably the worst case.

Nevertheless, here are some numbers:
  • calling a pass-function:
    100000 loops, best of 3: 110 ns per loop
  • calling the above function after decorating:
    100000 loops, best of 3: 295 ns per loop
  • calling a pass-function, that takes some parameters:
    100000 loops, best of 3: 399 ns per loop
  • calling the above function after decorating:
    100000 loops, best of 3: 796 ns per loop
  • defining a pass-function:
    100000 loops, best of 3: 66.4 ns per loop
  • decorating a pass-function:
    100000 loops, best of 3: 24.4 µs per loop
The numbers look quite good. While decorating is really slow (because inspect.getargspec is), calling is really fast.
Decorating adds 185 ns and 397 ns per call, respectively, which is less than twice the cost of the undecorated pass-function (and that is the worst case).
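
A comparison of this kind can be tried outside of sage with python's standard timeit module. The following is only a sketch: "recording" is a made-up stand-in decorator, not the citation decorator, and the absolute numbers will differ from machine to machine.

```python
import timeit

used = set()  # stand-in for the set of used citations


def recording(item):
    """Hypothetical decorator: remember `item` every time f is called."""
    def decorator(f):
        def wrapper(*args, **kwargs):
            used.add(item)
            return f(*args, **kwargs)
        return wrapper
    return decorator


def plain():
    pass


@recording("some_item")
def decorated():
    pass


def best_ns(func, number=100000, repeat=3):
    """Best time per call in nanoseconds ('best of 3', as in the list above)."""
    times = timeit.repeat(func, number=number, repeat=repeat)
    return min(times) / number * 1e9


plain_ns = best_ns(plain)
decorated_ns = best_ns(decorated)
print("plain:     %.0f ns per loop" % plain_ns)
print("decorated: %.0f ns per loop" % decorated_ns)
```

Taking the minimum over several repetitions keeps the influence of other processes on the machine small, which matters when the differences are only a few hundred nanoseconds.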

Michael had the idea to benchmark a polynomial addition:
sage: from sage.misc.sage_timeit import sage_timeit
sage: P.<x,y,z>=QQ[]
sage: f = x * y + z
sage: def add(a, b):
....:     return a + b
sage: from sage.citation import cites, citable_items
sage: @cites(citable_items.libsingular)
....: def add_citing(a, b):
....:     return a + b
sage: sage_timeit("add(f, x)", {'add':add, 'f':f, 'x':x},
....:     number=100000, preparse=False)
100000 loops, best of 3: 279 ns per loop
sage: sage_timeit("add(f, x)", {'add':add_citing, 'f':f, 'x':x},
....:     number=100000, preparse=False)
100000 loops, best of 3: 712 ns per loop
Although it looks like the computation is significantly slower, the loss of speed is acceptable. With slower computations (with more complex polynomials), one will not be able to measure the difference.
sage: from sage.misc.sage_timeit import sage_timeit
sage: P.<a,b,c> = PolynomialRing(QQ,3, order='lex')
sage: I = P.ideal([a,a+1])

# Without example-usage applied
sage: sage_timeit("I.groebner_basis('libsingular:slimgb')",
....:     {"I":I}, number=100000, preparse=False)
100000 loops, best of 3: 2.09 µs per loop
sage: sage_timeit("I.groebner_basis('libsingular:slimgb')",
....:     {"I":I}, number=100000, preparse=False)
100000 loops, best of 3: 2.14 µs per loop

# With decorator and citations.add
sage: sage_timeit("I.groebner_basis('libsingular:slimgb')",
....:     {"I":I}, number=100000, preparse=False)
100000 loops, best of 3: 2.06 µs per loop
sage: sage_timeit("I.groebner_basis('libsingular:slimgb')",
....:     {"I":I}, number=100000, preparse=False)
100000 loops, best of 3: 2.16 µs per loop
As you can see, the measurement error is much larger than the loss of speed. That indicates that the decorator is really fast.
If you are into Gröbner bases (I am not), you probably know that this is a rather trivial computation. Easy computations give the worst benchmarking results (see above).
Still, the results are very good.

For detailed information, please take a look at bitbucket.

citation service

Thanks to Burcin Erocal, we now have a document describing the citation module.

Friday, August 12, 2011

Uploading ...

I just uploaded a few patches to sage-trac.

But first things first. Michael and I were at the University of Kaiserslautern from Tuesday through Thursday to visit Burcin Erocal. We had quite a nice time together and did a lot of coding.

The documentation coverage of the files I submitted to sage-trac is now 100%. That was quite tough, and we found way too many bugs during doctesting.


The design changed again, but only a little.

In short:
  • "citable_items" holds the citation information.
  • "@cite" is the decorator that marks a function as using a citable item.
  • The citations used so far are available through "citations".
All three are available through "sage.citation".

Detailed information is available in the docstrings.
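
To make the three parts a bit more concrete, here is a rough sketch in plain python. Only the names citable_items, cite and citations come from the list above. Everything else (the CitableItem class, the placeholder entries, the use of a plain set) is an assumption for illustration and not the code that was uploaded.

```python
class CitableItem(object):
    """Hypothetical container for one piece of citation information."""
    def __init__(self, name):
        self.name = name


class citable_items(object):
    """Namespace holding the citation information (placeholder entries)."""
    libsingular = CitableItem("libsingular")
    magma = CitableItem("magma")


citations = set()  # the citations used so far in this session


def cite(item):
    """Mark a function as using `item`; the item is recorded on each call."""
    def decorator(f):
        def wrapper(*args, **kwargs):
            citations.add(item)
            return f(*args, **kwargs)
        return wrapper
    return decorator


@cite(citable_items.libsingular)
def add(a, b):
    return a + b


assert not citations  # nothing has been used yet
add(1, 2)
print(sorted(c.name for c in citations))  # ['libsingular']
```

Note that nothing is recorded when the function is decorated, only when it is called, so the result reflects what was actually used.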

Friday, July 22, 2011

The Design

This week, Michael came up with the idea to write a blog about my work. Please start reading it from the beginning. The blog should document the progress of my work. It is also meant to simplify the communication between new contributors, interested experts and me.

We put a lot of effort into improving the design and made great progress. Currently, the design is the following (I hope we can keep it).

citable.py
Contains all the meta-class-magic: CitationBase provides some basic functions. LoadableCitation can load the bib-file using pybtex when the citation is needed. It then changes the class (!) of the element to Citation, which can print the citation using pybtex. We get a cache for free.
The class Citable: all classes representing citable elements in sage (citables.py) inherit from this class. When they are created, they all have the (meta-)class LoadableCitation. They can specify a type (default: software) and the filename of the bib-file (if it differs from the class' name).
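
The trick of changing the class after loading can be sketched in a few lines. This is a simplified illustration and not the real citable.py: it swaps the class of an ordinary instance instead of the metaclass of a class, it does not use pybtex, and the "loading" is faked with a string.

```python
class Citation(object):
    """State after loading: the entry is available."""
    def render(self):
        return self.entry


class LoadableCitation(object):
    """State before loading: only the file name is known."""
    loads = 0  # counts how often the (pretend) bib-file is read

    def __init__(self, filename):
        self.filename = filename

    def render(self):
        # Stand-in for parsing the bib-file.
        LoadableCitation.loads += 1
        self.entry = "<entry loaded from %s>" % self.filename
        # Swap the class of this object; later calls go straight
        # to Citation.render and never load again.
        self.__class__ = Citation
        return self.render()


c = LoadableCitation("singular.bib")
first = c.render()   # loads the entry and changes the class
second = c.render()  # served by Citation, no second load
print(type(c).__name__, LoadableCitation.loads)  # Citation 1
```

This is where the cache comes from: once the class has changed, the loading code is simply no longer reachable for that object.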

citables.py
from citable import Citable

class singular(Citable):
    pass

class magma(Citable):
    pass

class slimgb(singular):
    pass
Note that the code only holds the dependencies between citations; the rest is done automatically. (There are more examples below.)

citation.pyx
A cython-file containing the "ugly stuff". You can decorate your own function with @use_citation(citable). When the function is called, the citation is added to the set of used citations. You can also add to this set manually by calling used_citations(citable). To get all the used citations (for this sage-session), simply call: print used_citations
You can also get the output in any format supported by pybtex (bibtex, bibtexml, bibyaml, latex, html, plaintext) by calling used_citations.print_all(format).

>>> @use_citation(citables.magma)
... def simple_func():
...     pass
>>> print used_citations
None
>>> simple_func()
>>> print used_citations
@article{
    MR1484478,
    author = "Bosma, Wieb and Cannon, John and Playoust, Catherine",
    volume = "24",
    doi = "10.1006/jsco.1996.0125",
    title = "The {M}agma algebra system. {I}. {T}he user language",
    url = "http://dx.doi.org/10.1006/jsco.1996.0125",
    journal = "J. Symbolic Comput.",
    issn = "0747-7171",
    mrclass = "68Q40",
    number = "3-4",
    note = "Computational algebra and number theory (London, 1993)",
    mrnumber = "MR1484478",
    year = "1997",
    pages = "235--265",
    fjournal = "Journal of Symbolic Computation"
}
>>> print used_citations.print_all("latex")
\begin{thebibliography}{8}

\bibitem[1]{MR1484478}
Wieb Bosma, John Cannon, and Catherine Playoust.
\newblock The {M}agma algebra system. {I}. {T}he user language.
\newblock \emph{J. Symbolic Comput.}, 24:235--265, 1997.

\end{thebibliography}

Dependencies of citations are also possible.
>>> used_citations.forget()
>>> used_citations(citables.slimgb)
>>> used_citations.format = "latex"
>>> print used_citations
\begin{thebibliography}{8}

\bibitem[1]{DGPS}
W.~Decker, G.-M. Greuel, G.~Pfister, and H.~Sch\"{o}nemann.
\newblock {\sc Singular} {3-1-3} --- {A} computer algebra system for polynomial computations.
\newblock 2011, http://www.singular.uni-kl.de.

\bibitem[2]{slimgbrevista}
Michael Brickenstein.
\newblock Slimgb: {G}r{\"o}bner bases with slim polynomials.
\newblock \emph{Revista Matemática Complutense}, 23, issue 2:453--466, 2010.

\end{thebibliography}
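
One way such dependencies could be collected from the class hierarchy in citables.py is to walk the method resolution order of the class. The helper with_dependencies below is hypothetical and only illustrates the idea; I am not claiming that citation.pyx does it this way.

```python
class Citable(object):
    pass


class singular(Citable):
    pass


class slimgb(singular):
    pass


def with_dependencies(citable):
    """Return `citable` and the citables it inherits from, base classes first."""
    chain = [c for c in citable.__mro__
             if issubclass(c, Citable) and c is not Citable]
    return list(reversed(chain))


names = [c.__name__ for c in with_dependencies(slimgb)]
print(names)  # ['singular', 'slimgb']
```

The order matches the output above: the citation for singular comes first, followed by the one for slimgb.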

There is a logging feature, which I'm not happy with yet. Furthermore, the documentation needs huge improvements.

It is very interesting to see how well the low-level stuff, like cython-optimization and calling directly into Sets and Tuples, works together with high-level magic like metaclasses and changing a class' (meta-)class.

The thing about performance

My third week was more relaxed. My boss was on a business trip and I was alone in the office all day.
Well, I learned cython and made some attempts at integrating pybtex into my sage-citations. Furthermore, I learned about sage's SPKGs and made my first contribution to sage. In addition, I cut metaclasses, because I couldn't make them work with cython. (They'll come again, don't worry.)

But I have to write about our performance problems. They are actually the reason why Michael made me take a look at cython.
It would be very nice if the citation-implementation were used widely throughout sage. You could just use sage as usual, and then say "give me everything", and it would give you back all necessary citations. But to achieve this, we need many sage-functions to be decorated like "this function uses software_z and refers to paper_xy". This makes every function call slower. If the loss of speed were noticeable, no-one would use the decorator.
You can easily see the performance-orientation in the code. Some things are actually pretty ugly (I'm always open to suggestions for improvements).

If you don't like talking to me, talk to Burcin Erocal.

Starting over and over ...

You can always take a look at the code (although it might not work from time to time, due to ongoing development). You might also refer to the repository as my micro-blog.

Please write me a message on BitBucket or comment on one of my Blog Posts, if you want to get in contact with me.

My second week was a week of trying and re-trying. I made several attempts at a viable code design. It took many rewrites and we did a lot of experiments. There was a function-only design, a mixed design with classes and functions, and one with no code outside the classes. And finally, metaclasses. I learned metaclasses in my second week of python. (In short: a class is itself an object, so it has a class of its own, the metaclass, and is an instance of it.) Michael made a performance test for me, so that we can optimize the performance more efficiently.
I will explain the performance problem later, after I have explained the (planned) design of the whole citation-implementation.
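
For readers who have not met metaclasses yet, here is a tiny generic example of "a class can have a class". It has nothing to do with the citation code; the names Meta and Thing are made up.

```python
class Meta(type):
    """A metaclass: its instances are classes."""
    def describe(cls):
        return "I am the class %s" % cls.__name__


# Calling the metaclass creates a class,
# just as calling a class creates an instance.
Thing = Meta("Thing", (object,), {})

thing = Thing()
print(type(thing) is Thing)  # True: thing is an instance of Thing
print(type(Thing) is Meta)   # True: Thing is an instance of Meta
print(Thing.describe())      # I am the class Thing
```

Methods defined on the metaclass, like describe here, are available on the class itself but not on its instances.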

During my second week, I wrote a lot of documentation (+ doctests). I knew that it would not survive the following days, but I also knew that I have to be able to write documentation about my code at any time.
Actually, it was fun too.

Hello World!

This is the first entry in my first blog.
Nota Bene: If you don't know what python is (and I don't mean the snake), you should probably leave now. Otherwise, this Blog could be fun for you :-).

But let's start from the beginning.
I started working at the mathematical research institute of Oberwolfach on the 27th of June. I was advised to read chapters one through six of DiveIntoPython before starting to work here.
But learning python wasn't over when I came here. The other guy in my office (actually it is his office and he is my boss) taught me closures, nested functions and decorators.

Here comes an easy example of what a decorator is (in case you don't already know):
def my_function():
    print("that")

def my_decorator(f):
    # Return a new function that wraps f; nothing runs until it is called.
    def wrapper():
        print("this")
        f()
        print("the other")
    return wrapper

my_function = my_decorator(my_function)

my_function()
this
that
the other

But python has a shortcut for decorators:
@my_decorator
def moo():
    print("moo")

moo()
this
moo
the other

I had never worked with python before. As you can imagine, it was brain-twisting (but even more fun). python's feature that everything is a first-class object is really great, and it makes my work easier and more demanding at the same time.

My boss Michael Brickenstein keeps saying that he feels guilty for torturing me with nested functions in my first days of python. Therefore, he sometimes brings home-baked cookies to the office.

From Tuesday onwards, a collaborator of this project came from Berlin to visit Michael. The three of us had a lot of fun discussing and coding together. I was able to take a look at his work (mainly sql). Therefore, I didn't do much python programming for the rest of the week.
But I found out what my position within this project will be. The project, by the way, is called S-MATH and has to do with citing mathematical software. The project just started and will probably take a long time.
As far as I understood, I am the one who should get in touch with the sage community and try to build an implementation for citing mathematical software into sage.

If you are involved in the sage community, this Blog is exactly right for you; we hope for rich communication.
I hope that the sage community will accept me, and I will always be open to suggestions and comments on my work.