Saturday, July 26, 2008

Light Refactoring turns into Major Refactoring

Well,

As much as I'd like to keep the project simple, it hasn't been simple from day one. Due to the features I'm planning on, I figured it best if I restructured the project, the Scripting Language Foundation, into a series of projects. There's the primary Type System, the CLI Type System, and the OIL (Intermediate) Type System. These three systems are best kept separate to help simplify project management. I also just got tired of scrolling through over a thousand files in one project.

The advantage of the larger-scale refactoring is quicker build times (yay), so long as I don't modify the core type system (which I'm still doing, so 'boo').

Another thing I'm hoping this will result in is a system that's lightly retargetable in the future, so long as there's an alternative for the CLI type system, and for the CIL in general. The abstract type system, as it is, really just defines things in a general sense; the primary reasons it relies on the CLI type system are for the base types of arrays, byref types, enumerations, structures, and so forth (a simple retarget, if another target exists). Granted, there are probably very few targets that are as high-level as the abstract type system specifies.
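To make the retargeting idea a bit more concrete, here's a minimal sketch in Python (not the project's own language). Every name in it (`TargetTypeSystem`, `CliTarget`, `AbstractType`, `base_type_of`) is made up for illustration and isn't taken from the actual code base. It assumes the abstract layer only needs to ask a target for the base type of each kind of type.

```python
from dataclasses import dataclass
from typing import Protocol


class TargetTypeSystem(Protocol):
    """Hypothetical contract a target must satisfy for the abstract layer."""

    name: str

    def base_type_of(self, kind: str) -> str:
        """Return the target's base type for a kind such as 'array'."""
        ...


@dataclass(frozen=True)
class CliTarget:
    """Illustrative CLI target; the kind names are assumptions."""

    name: str = "CLI"

    def base_type_of(self, kind: str) -> str:
        # Base types the CLI assigns to each kind of type.
        bases = {
            "array": "System.Array",
            "enumeration": "System.Enum",
            "structure": "System.ValueType",
        }
        return bases[kind]


@dataclass(frozen=True)
class AbstractType:
    """A type described only in general terms, with no target baked in."""

    name: str
    kind: str

    def base_type(self, target: TargetTypeSystem) -> str:
        # The abstract layer defers to whichever target it is handed.
        return target.base_type_of(self.kind)


color = AbstractType("Color", "enumeration")
print(color.base_type(CliTarget()))
```

Under that assumption, retargeting would amount to supplying another class with the same two members, which is about as 'simple' as the retarget could get.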

Here's hoping things go well. The refactoring presently compiles. Instead of one quite sizable library, it's now broken down into concepts:

  1. Abstract Syntax Tree
    1. Abstract
    2. Common Language Infrastructure
    3. Objectified Intermediate Language
  2. Compilers
  3. Languages
    1. Abstract
    2. C♯
    3. Common Intermediate Language (CIL)
    4. Visual Basic.NET
  4. Linkers
  5. Transformation
  6. Translation

The Abstract Syntax Trees (ASTs) for the first two targets, Abstract and CLI, are more type systems than ASTs. The third, OIL, is a type system and malleable infrastructure that injects more code-principles than the first two.

The transformation stage will utilize information about the language as stipulated by contextual data, which is defined in the Language Abstract. A given language specifies what's supported, and this information is piped through the appropriate channels that use the transformers, such as compilers and code translators. The transformation framework will use the limitations of the language to determine which parts of the intermediate structure need transformations applied and which ones to apply.
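As a rough illustration of that selection step, here's a small Python sketch. It assumes each node of the intermediate structure is tagged with the features it uses and each language exposes a set of supported features. The names (`Language`, `Node`, `TRANSFORMERS`, `plan_transformations`) and the feature labels are hypothetical, not the framework's real API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Language:
    name: str
    supported: frozenset  # features the language can express directly


@dataclass
class Node:
    label: str
    features: set = field(default_factory=set)
    children: list = field(default_factory=list)


# Hypothetical registry: feature -> name of the transformation that lowers it.
TRANSFORMERS = {
    "lambda": "lift_lambda_to_closure_class",
    "iterator": "rewrite_iterator_as_state_machine",
}


def plan_transformations(node, language, path=""):
    """Return (location, transformer) pairs for features the language lacks."""
    here = f"{path}/{node.label}"
    plan = []
    # Only features missing from the language need any work at all.
    for feature in sorted(node.features - language.supported):
        if feature in TRANSFORMERS:
            plan.append((here, TRANSFORMERS[feature]))
    for child in node.children:
        plan.extend(plan_transformations(child, language, here))
    return plan


tree = Node("Program", children=[
    Node("Main", {"lambda", "loop"}),
    Node("Items", {"iterator"}),
])
no_lambdas = Language("NoLambdas", frozenset({"loop", "iterator"}))
for location, transformer in plan_transformations(tree, no_lambdas):
    print(location, "->", transformer)
```

The point of the sketch is only that the language's limitations drive the plan: a language that supports everything in the tree gets an empty plan, and nothing is transformed that doesn't need to be.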

It will also verify that the code as presented is even valid for that language, given its limitations. Certain features of the CLI don't have workarounds, such as 'base' and 'MyClass', while others do, à la lambdas. If a language can't already explicitly call a virtual method non-virtually in an appropriate context (base, current scope, et cetera), there's no amount of trickery that can allow for it.

You could, in theory, write an adapter class for such instances that would emit the proper CIL to handle the task at hand, but the usability and maintenance of such code would be... questionable at best. It's best to flag the region as invalid for that particular language. The areas marked as 'invalid' for a language are typically why two languages are not 100% interoperable with each other. Visual Basic code can't always be translated to C♯ code, and the same applies in the other direction.
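The validation pass could look something like the following Python sketch. It assumes each region of code is tagged with a single feature. `Region`, `WORKAROUNDS`, `validate`, and the feature names are all invented for the example, and which features count as having a workaround is an assumption rather than a statement about the real framework.

```python
from dataclasses import dataclass

# Hypothetical table: features that can be rewritten when a language lacks them.
WORKAROUNDS = {"lambda": "closure class"}


@dataclass(frozen=True)
class Region:
    location: str
    feature: str


def validate(regions, supported):
    """Split regions into (ok, needs_transform, invalid) for one language."""
    ok, needs_transform, invalid = [], [], []
    for region in regions:
        if region.feature in supported:
            ok.append(region)  # the language expresses this directly
        elif region.feature in WORKAROUNDS:
            needs_transform.append(region)  # expressible after a rewrite
        else:
            invalid.append(region)  # no workaround: flag it, don't fake it
    return ok, needs_transform, invalid


regions = [
    Region("Widget.Render", "lambda"),
    Region("Widget.Reset", "nonvirtual_self_call"),
    Region("Widget.Count", "loop"),
]
ok, needs_transform, invalid = validate(regions, supported={"loop"})
print([r.location for r in invalid])
```

Anything that lands in the third bucket is reported rather than papered over with an adapter, which matches the 'flag it as invalid' approach above.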
