Generally, all core functions have plain names, and almost none are bodied or infix operators. The file "src/yacasapi.cc" in the source tree lists the declarations of all kernel functions; consult it for reference. For many of the core functions, the script library already provides convenient aliases. For instance, the addition operator "+" is defined in the script file scripts/standard, while the actual addition of numbers is performed by the built-in function MathAdd.
The type of an object is returned by the built-in function Type, for example:
In> Type(a);
Out> "";
In> Type(F(x));
Out> "F";
In> Type(x+y);
Out> "+";
In> Type({1,2,3});
Out> "List";
Note that atoms resulting from an Atom() call may be invalid and never evaluate to anything. For example, Atom("3X") is an atom with string representation "3X" but with no other properties.
Currently, the core engine provides no other lowest-level objects besides strings and lists. It is, however, possible to link externally compiled code that provides additional types of objects; these are available in Yacas as "generic objects".
Evaluation of an atom proceeds as follows: if the atom is bound locally as a variable, the object it is bound to is returned; otherwise, if it is bound as a global variable, that value is returned. Otherwise, the atom is returned unevaluated.
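These binding rules can be seen in a short session (a sketch; the exact Out> formatting may differ between Yacas versions):

In> x := 5;           /* bind x as a global variable */
In> x + 2;            /* x evaluates to 5 */
Out> 7;
In> y + 2;            /* y is unbound, so the atom y is returned unevaluated */
Out> y+2;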
Lists of atoms are generally interpreted in the following way: the first atom of the list is some command, and the atoms following in the list are considered the arguments. The engine first tries to find out if it is a built-in command (core function). In that case, the function is executed. Otherwise, it could be a user-defined function (with a "rule database"), and in that case the rules from the database are applied to it. If none of the rules are applicable, or if no rules are defined for it, the object is returned unevaluated.
The main properties of this scheme are the following. When objects are assigned to variables, they are generally evaluated (unless you use the Hold() function), because the assignment var := value is really a function call Set(var, value), and this function evaluates its second argument (but not its first). When that variable is referenced again, the object which is its value will not be re-evaluated. Also, the default behaviour of the engine is to return the original expression if it could not be evaluated. This is the desired behaviour when evaluation is used for simplifying expressions.
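The effect of Hold() on assignment can be sketched in a short session (output formatting may vary):

In> a := 2+3;         /* right-hand side is evaluated before binding */
In> a;
Out> 5;
In> b := Hold(2+3);   /* Hold() prevents evaluation of the right-hand side */
In> b;
Out> 2+3;

Referencing b returns the unevaluated expression 2+3, and the stored value is not re-evaluated on lookup.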
One major design flaw in Yacas (one that other functional languages such as LISP also have) is that when an expression is re-evaluated in another environment, the local variables contained in that expression might have a different meaning there. In such cases it can be useful to use the functions LocalSymbols and TemplateFunction. Calling
LocalSymbols(a,b) a+b;

returns the expression a+b with the symbols a and b replaced by unique, internally generated symbols, so that they cannot collide with variables of the same names when the expression is evaluated later in another environment.
A function is identified by its name as returned by Type and the number of arguments, or "arity". The same name can be used with different arities to define different functions: f(x) is said to 'have arity 1' and f(x,y) has arity 2. Each of these functions may possess its own set of specific rules, which we shall call a "rule database" of a function.
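For example, the same name can carry two unrelated definitions with different arities (a sketch; the outputs assume no other rules for f are defined):

In> f(x) := x + 1;    /* f with arity 1 */
In> f(x,y) := x * y;  /* f with arity 2 */
In> f(3);
Out> 4;
In> f(3,4);
Out> 12;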
Each function should be first declared with the built-in command RuleBase as follows:
RuleBase("FunctionName",{argument list});
In> f(a):=g(a)+1;
Out> True;
In> f(B);
Out> g(B)+1;
In> g(1+1);
Out> g(1+1);
In> RuleBase("g",{x});
Out> True;
In> f(B);
Out> g(B)+1;
In> g(1+1);
Out> g(2);
The shorthand operator := for creating user functions that we illustrated in the tutorial is actually defined in the scripts and it makes the requisite call to the RuleBase function. After a RuleBase call you can specify parsing properties for the function if you like; for example, you could make it an infix or bodied operator.
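For instance, after declaring a rule base one could make the new function an infix operator (a sketch; the operator name "##" is an arbitrary example, and output formatting may vary):

In> RuleBase("##", {x,y});
In> Infix("##", OpPrecedence("+"));
In> a ## b;
Out> a##b;

Since no rules have been defined for "##" yet, the expression is simply returned unevaluated, but it is now parsed as an infix operation.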
Now we can add some rules to the rule database for a function. A rule simply states that if a specific function object with a specific arity is encountered in an expression and if a certain predicate is true, then Yacas should replace this function with some other expression. To tell Yacas about a new rule you can use the built-in Rule command. This command is what does the real work for the somewhat more aesthetically pleasing ... # ... <-- ... ; construct we have seen in the tutorial. Incidentally, you do not have to call RuleBase explicitly if you use that construct.
Here is the general syntax for a Rule call:
Rule("foo", arity, precedence, predicate) body;
All rules for a given function can be erased with a call to TryRetract(funcname, arity). This is useful, for instance, when too many rules have been entered in interactive mode. This call undefines the function and also invalidates the RuleBase declaration.
You can specify that function arguments are not evaluated before they are bound to the parameters: HoldArg("foo",a); declares that the argument a in both foo(a) and foo(a,b) should not be evaluated before being bound to a. Here the argument name "a" should be the same as the one used in the RuleBase call declaring these functions. Inhibiting evaluation of certain arguments is useful for procedures that act partly on a variable in the expression, such as integration, differentiation and looping, and will typically be used for functions that are algorithmic and procedural by nature.
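The effect of HoldArg can be sketched as follows (assuming no rules are defined for foo; output formatting may vary):

In> RuleBase("foo", {a});
In> foo(2+3);          /* argument is evaluated before binding */
Out> foo(5);
In> HoldArg("foo", a);
In> foo(2+3);          /* argument is now passed unevaluated */
Out> foo(2+3);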
Rule-based programming normally makes heavy use of recursion and it is important to control the order in which replacement rules are to be applied. For this purpose, each rule is given a precedence. Precedences go from low to high, so all rules with precedence 0 will be tried before any rule with precedence 1.
You can assign several rules to the same function, as long as some of the predicates differ. If none of the predicates is true, the function is returned with its arguments evaluated.
This scheme is slightly slower for ordinary functions that just have one rule (with the predicate True), but it is a desired behaviour for symbolic manipulation. You can slowly build up your own functions, incrementally testing their properties.
In> RuleBase("f",{n});
Out> True;
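For example, the following two Rule calls define f(n) as the factorial function (a reconstruction consistent with the description below; the rule bodies are illustrative):

In> Rule("f", 1, 10, n=0) 1;
Out> True;
In> Rule("f", 1, 20, IsInteger(n) And n>0) n*f(n-1);
Out> True;
In> f(5);
Out> 120;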
The Rule commands in this example specified two rules for function "f" with arity 1: one rule with precedence 10 and predicate n=0, and another with precedence 20 and the predicate that returns True only if "n" is a positive integer. Rules with lowest precedence get evaluated first, so the rule with precedence 10 will be tried before the rule with precedence 20. Note that the predicates and the body use the name "n" declared by the RuleBase call.
After declaring RuleBase for a function, you could tell the parser to treat this function as a postfix operator:
In> Postfix("f");
Out> True;
Function("FirstOf", {list}) list[1];

This is equivalent to the longer form with an explicit statement block:

Function("FirstOf", {list})
[
    list[1];
];

Finally, the function FirstOf could also have been defined by typing

FirstOf(list) := list[1];
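Any of these definitions gives the same behaviour (a sketch):

In> FirstOf({a,b,c});
Out> a;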
Function("ForEach",{foreachitem,foreachlist,foreachbody}) |
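The complete definition presumably looked like the following (a reconstruction based on the explanation below; the actual script library source may differ in details):

Function("ForEach",{foreachitem,foreachlist,foreachbody})
[
    Local(foreachi,foreachlen);
    foreachlen := Length(foreachlist);
    foreachi := 0;
    While (foreachi < foreachlen)
    [
        foreachi++;
        MacroLocal(foreachitem);
        MacroSet(foreachitem, foreachlist[foreachi]);
        Eval(foreachbody);
    ];
];
HoldArg("ForEach", foreachitem);
HoldArg("ForEach", foreachbody);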
Finally, HoldArg("function",argument) specifies that the argument argument should not be evaluated before being bound to that variable. This holds for foreachitem and foreachbody, since foreachitem specifies the variable to be set to each value, and foreachbody is the expression that should be evaluated after that variable is set.
Inside the body of the function definition there are calls to Local(...). Local declares local variables that are only visible within the block [ ... ] in which they are declared. The command MacroLocal works almost the same; the difference is that it evaluates its arguments before performing the action on them. This is needed here because the variable foreachitem is bound to the variable to be used, and it is the variable it is bound to that we want to make local, not "foreachitem" itself. MacroSet works similarly: it does the same as Set, except that it also evaluates its first argument, thus setting the variable requested by the user of this function. The Macro... built-in functions generally perform the same action as their non-macro counterparts, apart from evaluating an argument the non-macro version would not evaluate.
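The difference between Set and MacroSet can be seen in a short sketch (output formatting may vary):

In> v := x;          /* v is bound to the atom x */
In> MacroSet(v, 3);  /* first argument is evaluated: x, not v, gets set */
In> x;
Out> 3;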
To see the function in action, you could type:
ForEach(i,{1,2,3}) [Write(i); NewLine();];
Note: the variable names "foreach..." have been chosen so they won't get confused with normal variables you use. This is a major design flaw in this language. Suppose there were a local variable foreachitem, defined in the calling function and used in foreachbody. The two would collide, and the interpreter would use only the last defined version. In general, when writing a function that calls Eval, it is a good idea to use variable names that cannot easily be mistaken for others. This is generally the single largest cause of bugs when writing programs in Yacas. The issue should be addressed in the future.
The parser supports several syntactic constructs that are mapped to ordinary function calls internally:

- Lists, like for example {a,b}. These are parsed into the internal notation (List a b), but will be printed again as {a,b}.
- Statement blocks like [statement1(); statement2();], which are parsed into (Prog (statement1) (statement2)) and printed out again in the proper form.
- Object argument accessors of the form expr[ index ]. These are mapped internally to Nth(expr,index). Index 0 returns the operator of the object, index 1 the first argument, and so on. So, if expr is foo(bar), then expr[0] returns foo and expr[1] returns bar. Since lists of the form {...} are essentially the same as List(...), the same accessors can be used on lists.
- Function blocks like
While (i < 10)
[
    Write(i);
    i := i + 1;
];
Strings are generally represented with quotes around them, as in "this is a string". A backslash \ inside a string will unconditionally add the next character to the string, so a quote can be included as \".
You can tell the interpreter that a function can see the local variables from the calling environment by declaring UnFence(funcname, arity) on the specified function.
One example of how rules can produce unwanted results is the rule a*0=0. This would always seem to be true. However, when a is a vector, like a:={b,c,d}, then a*0 should actually return {0,0,0}, that is, a null vector. The rule a*0 -> 0 actually changes the type of the expression from a vector to an integer! This can have severe consequences when other functions that use this expression as an argument expect a vector, or even worse, have one definition for working on vectors and a different one for working on numbers.
When writing rules for an operator, it is assumed that the operator working on arguments, like Sin or *, will always have the same properties regardless of the arguments. The Taylor series expansion of Sin(a) will be the same regardless of whether a is a real number, complex number or even a matrix. The same trigonometric identities should hold for Sin, regardless of the type of a, too.
If a function is defined which does not adhere to these rules when applied to another type, a different function should be defined.
By default, if a variable has not been bound yet, it is assumed to be a number. If a variable a is in fact a more complex object, such as a vector, you can declare it to be of 'incomplete type' vector, using Object("IsVector",a). This subexpression evaluates to a if and only if a is a vector at that moment of evaluation. Otherwise it returns unevaluated, and thus stays an incomplete type.
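The intended behaviour of incomplete types can be sketched like this (output formatting may vary):

In> Object("IsVector", a);       /* a is not (yet) a vector */
Out> Object("IsVector",a);
In> a := {1,2,3};
In> Object("IsVector", a);       /* a is now a vector, so the guard evaluates */
Out> {1,2,3};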
So the type of a variable is numeric unless stated otherwise by the user with the Object command. No rules should ever work on incomplete types; they are meant only for delayed simplification.
The topic of implicit type of an object is important, since many rules often implicitly need to be able to assume something is a number.
The problem mentioned above with a rule for vectors and scalars could be solved by making two rules:

- a*b (with b a vector): return the vector with each component multiplied by a.
- a*0 -> 0.

So vector multiplication would be tried first.
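Such a pair of rules might be sketched as follows (the precedence values are illustrative, and the library function ZeroVector, which returns a list of zeros, is assumed to be available):

/* vector rule: tried first because of the lower precedence value */
10 # (b_IsVector * 0) <-- ZeroVector(Length(b));
/* scalar rule: only reached when the vector rule did not apply */
20 # (_a * 0) <-- 0;

With this ordering, {b,c,d}*0 yields {0,0,0} rather than the integer 0, so the type of the expression is preserved.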
The ordering of the precedences of the rules in the standard math scripts is currently as follows:

- 50-60: arguments are numbers: calculate directly. These rules are placed at the beginning, so they are tried first. This is useful for quickly obtaining numeric results when all the arguments are already numeric and symbolic transformations are not necessary.
- 100-199: tautologies. Transformations that do not change the type of the argument and are always true.
- 200-399: type-specific transformations. Transformations for specific types of objects.
- 400-599: transformations on scalars (variables are assumed to be scalars), meaning transformations that can potentially change the type of an argument.
Internally, an expression such as a+b+c+d+e is stored as a tree of nested binary "+" operations, with the operator at each node:

(+ (+ (+ (+ a b) c) d) e)
So our first try is to define rules for transforming an atom and for transforming sums and products. The result of CForm() will always be a string and we can use recursion like this:
In> 100 # CForm(a_IsAtom) <-- String(a);
Out> True;
In> 100 # CForm(_a + _b) <-- CForm(a) : " + " : CForm(b);
Out> True;
In> 100 # CForm(_a * _b) <-- CForm(a) : " * " : CForm(b);
Out> True;
In> CForm(a+b*c);
Out> "a + b * c";
In> CForm(a+b*c*d+e+1+f);
Out> "a + b * c * d + e + 1 + f";
CForm(a+b*c)
    ... apply 2nd rule
CForm(a) : " + " : CForm(b*c)
    ... apply 1st and 3rd rules
"a" : " + " : CForm(b) : " * " : CForm(c)
    ... apply 1st rule
"a" : " + " : "b" : " * " : "c"
    ... concatenate strings
"a + b * c"
100 # CForm(+ _a) <-- "+ " : CForm(a);
100 # CForm(- _a) <-- "- " : CForm(a);
100 # CForm(_a - _b) <-- CForm(a) : " - " : CForm(b);
100 # CForm(_a / _b) <-- CForm(a) : " / " : CForm(b);
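With these additional rules, together with the atom rule from before, expressions involving subtraction and division are also translated (a sketch):

In> CForm(a - b / c);
Out> "a - b / c";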
In> CForm( (a+b) * c );
Out> "a + b * c";
100 # CForm(_a + _b) <-- "(" : CForm(a) : " + " : CForm(b) : ")"; 100 # CForm(- _a) <-- "(- " : CForm(a) : ")"; |
We could improve the situation by inserting parentheses only if the higher-order expression requires them; for this to work, we need to make a call such as "CForm(a+b)" aware that the enveloping expression has a multiplication by "c" in it. This can be implemented by passing an extra argument to "CForm()" that will indicate the precedence of the enveloping operation. A compound expression that uses an infix operator must be bracketed if the precedence of that infix operator is higher than the precedence of the enveloping infix operation.
We shall define an auxiliary function that we shall also call "CForm" but with a second argument, the precedence of the enveloping infix operation. If there is no enveloping operation, we shall set the precedence to a large number, e.g. 60000, so that no parentheses will be inserted around the whole expression. The new "CForm(expr, precedence)" will handle two cases: either parentheses are necessary, or they are not. For clarity we shall implement these cases in two separate rules. The initial call to "CForm(expr)" will be delegated to "CForm(expr, precedence)". The precedence values of infix operators such as "+" and "*" are fixed in Yacas but may change from version to version, so we shall use the function OpPrecedence() to determine them. The new rules could look like this:
PlusPrec := OpPrecedence("+");
100 # CForm(_expr) <-- CForm(expr, 60000);
100 # CForm(_a + _b, _prec)_(PlusPrec>prec) <-- "(" : CForm(a, PlusPrec) : " + " : CForm(b, PlusPrec) : ")";
120 # CForm(_a + _b, _prec) <-- CForm(a, PlusPrec) : " + " : CForm(b, PlusPrec);
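For the scheme to be complete, the remaining rules must be converted to the two-argument form in the same way; for example, the rules for atoms and for multiplication might look like this (a sketch following the same pattern, with illustrative rule precedences):

MultPrec := OpPrecedence("*");
110 # CForm(a_IsAtom, _prec) <-- String(a);
100 # CForm(_a * _b, _prec)_(MultPrec>prec) <-- "(" : CForm(a, MultPrec) : " * " : CForm(b, MultPrec) : ")";
120 # CForm(_a * _b, _prec) <-- CForm(a, MultPrec) : " * " : CForm(b, MultPrec);

With such rules in place, CForm((a+b)*c) would return "(a + b) * c" as desired, since the "+" subexpression is bracketed inside the tighter-binding multiplication.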
The options you have for debugging a faulty function are:

- Invoke the function from the command line, passing in different arguments, to see how it behaves.
- Trace the evaluation of an expression with TraceRule(), to see which rules are applied and in what order.
In> TraceRule(x+y) 2+3*5+4;
TrEnter(2+3*5+4);
  TrEnter(2+3*5);
    TrArg(2,2);
    TrArg(3*5,15);
  TrLeave(2+3*5,17);
  TrArg(2+3*5,17);
  TrArg(4,4);
TrLeave(2+3*5+4,21);
Out> 21;
Now we would like to manipulate such expressions, expanding brackets, collecting similar terms and so on, while taking care to always keep the non-commuting terms in the correct order. For example, we want Yacas to automatically simplify 2**B(k1)**3**A(k2) to 6**B(k1)**A(k2). Our goal is not to implement a general package to tackle complicated non-commutative operations; we merely want to teach Yacas about the two kinds of "quantum objects" called A(k) and B(k), and we shall define one function that a physicist needs for these objects. That function will compute something called a "vacuum expectation value", or "VEV", of any given expression containing A's and B's. This function has "classical" values and is defined as follows: VEV of a commuting number is just that number, e.g. VEV(4) = 4, VEV(Delta(k-l)) = Delta(k-l); and VEV(X**A(k)) = 0, VEV(B(k)**X) = 0, where X is any expression, commutative or not. It is straightforward to compute VEV of something that contains A's and B's: we just use the commutation relations to move all B's to the left of all A's, and then apply the definition of VEV.
10 # A(_k)**B(_l) <-- B(l)**A(k) + Delta(k-l);
RuleBase("**", {x,y});
Infix("**", OpPrecedence("*"));
In> A(_k)**B(_l) <-- B(l)**A(k)+Delta(k-l);
Out> True;
In> A(x)**B(y)
Out> B(y)**A(x)+Delta(x-y);
RuleBase("A", {k});
RuleBase("B", {k});
RuleBase("Delta", {k});
In> A(y)**B(z)*2
Out> 2*(B(z)**A(y)+Delta(y-z));
In> A(k)*2**B(l)
Out> 2*A(k)**B(l);
In> A(x)**A(y)**B(z)
Out> A(x)**A(y)**B(z);
In> (A(x)+B(x))**B(y)*3
Out> 3*(A(x)+B(x))**B(y);
In> FullForm( A(k)*2**B(l) )
(** (* 2 (A k ))(B l ))
Out> 2*A(k)**B(l);
A solution for this problem is not to write rules for all possible cases (there are infinitely many cases) but to systematically reduce expressions to a canonical form. "Experience has shown that" (a phrase used when we can't come up with specific arguments) symbolic manipulation of unevaluated trees is not efficient unless these trees are forced to a pattern that reflects their semantics.
We should choose a canonical form for all such expressions in a way that makes our calculations -- namely, the VEV() function -- easier. In our case, our expressions contain two kinds of ingredients: normal, commutative numbers and maybe a number of "quantum" symbols A(k) and B(k) multiplied together with the "**" operator. It will not be possible to divide anything by A(k) or B(k).
A possible canonical form for expressions with A's and B's is the following. All commutative numbers are moved to the left of the expression and grouped together as one factor; all non-commutative products are simplified to a single chain and all brackets expanded. A canonical expression should not contain any extra brackets in its non-commutative part. For example, (A(x)+B(x)*x)**B(y)*y**A(z) should be regrouped as a sum of two terms, (y)**(A(x)**(B(y)**A(z))) and (x*y)**(B(x)**(B(y)**A(z))). Here we wrote out all parentheses to show explicitly which operations are grouped. (We have chosen the grouping of non-commutative factors to go from left to right; however, this does not seem to be an important choice.) On the screen this will look simply like y ** A(x) ** B(y) ** A(z) and x*y ** B(x) ** B(y) ** A(z), because we have defined the precedence of the "**" operator to be the same as that of normal multiplication, so Yacas won't insert any more parentheses.
This canonical form will allow Yacas to apply all the usual rules on the commutative factor while cleanly separating all non-commutative parts for special treatment. Note that a commutative factor such as x*y will be multiplied by a single non-commutative piece with "**".
The basic idea behind the "canonical form" is this: we should define our evaluation rules in such a way that any expression containing A(k) and B(k) will be always automatically reduced to the canonical form after one full evaluation. All functions on our new objects will assume that the object is already in the canonical form and should return objects in the same canonical form.
First, we need to distinguish "classical" terms from "quantum" ones. For this, we shall define a predicate "IsQuantum()" recursively, as follows:
/* Predicate IsQuantum(): will return True if the expression
   contains A(k) or B(k) and False otherwise */
10 # IsQuantum(A(_x)) <-- True;
10 # IsQuantum(B(_x)) <-- True;

/* Result of a binary operation may be Quantum */
20 # IsQuantum(_x + _y) <-- IsQuantum(x) Or IsQuantum(y);
20 # IsQuantum(+ _y) <-- IsQuantum(y);
20 # IsQuantum(_x * _y) <-- IsQuantum(x) Or IsQuantum(y);
20 # IsQuantum(_x ** _y) <-- IsQuantum(x) Or IsQuantum(y);

/* If none of the rules apply, the object is not Quantum */
30 # IsQuantum(_x) <-- False;
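A few quick checks of the predicate; the outputs follow directly from the rules above:

In> IsQuantum(x + A(y));
Out> True;
In> IsQuantum(x*y + 1);
Out> False;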
Now we shall construct rules that implement reduction to the canonical form. The rules will be given precedences, so that the reduction proceeds by clearly defined steps. All rules at precedence X will benefit from all simplifications at earlier precedences.
/* First, replace * by ** if one of the factors is Quantum,
   to guard against user error */
10 # (_x * _y)_(IsQuantum(x) Or IsQuantum(y)) <-- x ** y;

/* Replace ** by * if neither of the factors is Quantum */
10 # (_x ** _y)_(Not(IsQuantum(x) Or IsQuantum(y))) <-- x * y;

/* Now we are guaranteed that ** is used between Quantum values */

/* Expand all brackets involving Quantum values */
15 # (_x + _y) ** _z <-- x ** z + y ** z;
15 # _z ** (_x + _y) <-- z ** x + z ** y;

/* Now we are guaranteed that there are no brackets next to ** */

/* Regroup the ** multiplications toward the right */
20 # (_x ** _y) ** _z <-- x ** (y ** z);

/* Move classical factors to the left: first, inside brackets */
30 # (x_IsQuantum ** _y)_(Not(IsQuantum(y))) <-- y ** x;

/* Then, move across brackets: y and z are already ordered
   by the previous rule */
/* First, if we have Q ** (C ** Q) */
35 # (x_IsQuantum ** (_y ** _z))_(Not(IsQuantum(y))) <-- y ** (x ** z);
/* Second, if we have C ** (C ** Q) */
35 # (_x ** (_y ** _z))_(Not(IsQuantum(x) Or IsQuantum(y))) <-- (x*y) ** z;
/* Commutation relation */
40 # OrderBA(A(_k) ** B(_l)) <-- B(l)**A(k) + Delta(k-l);
40 # OrderBA(A(_k) ** (B(_l) ** _x)) <-- OrderBA(OrderBA(A(k)**B(l)) ** x);

/* Ordering simple terms */
40 # OrderBA(_x)_(Not(IsQuantum(x))) <-- x;
40 # OrderBA(A(_k)) <-- A(k);
40 # OrderBA(B(_k)) <-- B(k);

/* Sums of terms */
40 # OrderBA(_x + _y) <-- OrderBA(x) + OrderBA(y);

/* Product of a classical and a quantum value */
40 # OrderBA(_x ** _y)_(Not(IsQuantum(x))) <-- x ** OrderBA(y);

/* B() ** X : B is already at left, no need to order it */
50 # OrderBA(B(_k) ** _x) <-- B(k)**OrderBA(x);

/* A() ** X : need to order X first */
50 # OrderBA(A(_k) ** _x) <-- OrderBA(A(k)**OrderBA(x));
In> OrderBA(A(k)**A(l))
Error on line 1 in file [CommandLine]
Line error occurred on:
>>>
Max evaluation stack depth reached.
Please use MaxEvalDepth to increase the stack size as needed.
50 # OrderBA(A(_k) ** _x) <-- OrderBAlate(A(k)**OrderBA(x));

55 # OrderBAlate(_x + _y) <-- OrderBAlate(x) + OrderBAlate(y);
55 # OrderBAlate(A(_k) ** B(_l)) <-- OrderBA(A(k)**B(l));
55 # OrderBAlate(A(_k) ** (B(_l) ** _x)) <-- OrderBA(A(k)**(B(l)**x));

60 # OrderBAlate(A(_k) ** _x) <-- A(k)**x;

65 # OrderBAlate(_x) <-- OrderBA(x);
100 # VEV(_x) <-- VEVOrd(OrderBA(x));

/* Everything is expanded now, deal with it term by term */
100 # VEVOrd(_x + _y) <-- VEVOrd(x) + VEVOrd(y);

/* Now cancel all quantum terms */
110 # VEVOrd(x_IsQuantum) <-- 0;

/* Classical terms are left */
120 # VEVOrd(_x) <-- x;
Finally, we try some example calculations to test our rules.
In> OrderBA(A(x)*B(y))
Out> B(y)**A(x)+Delta(x-y);
In> OrderBA(A(x)*B(y)*B(z))
Out> B(y)**B(z)**A(x)+Delta(x-z)**B(y)+Delta(x-y)**B(z);
In> VEV(A(k)*B(l))
Out> Delta(k-l);
In> VEV(A(k)*B(l)*A(x)*B(y))
Out> Delta(k-l)*Delta(x-y);
In> VEV(A(k)*A(l)*B(x)*B(y))
Out> Delta(l-y)*Delta(k-x)+Delta(l-x)*Delta(k-y);