Rubberduck: Split Rubberduck into separate Libraries?

Created on 22 Mar 2017 · 16 Comments · Source: rubberduck-vba/Rubberduck

Rubber Duck Team,

_I'm well aware this is probably 100% outside of the scope of Rubberduck, but..._

Would it be worthwhile to split Rubberduck into separate libraries/projects? For example:

- a simple library dedicated to parsing VBA code, with no dependencies on VBIDE,
- a library for analyzing the results and returning metrics / evaluating rules (Linq?),
- and Rubberduck proper, which implements the Addin, UI, Refactoring, Unit Tests, etc.

This would help others who are interested in developing Static Analyzer & Code Quality tools without them (me 😁) having to re-invent the turbo-charged engine that Rubberduck put together.

I know Rubberduck is currently being developed as multiple projects within a solution, but each project appears to be fairly tightly coupled to the others, which makes it difficult to part out. Not to mention there's no formal documentation on how it works.

Note: Yes, I am aware of SonarQube and VBDepend but they're not free ☹️.

-robodude666

Labels: discussion, enhancement, status-declined

Most helpful comment

Even though the MockVbeBuilder and its peers can be used to run RD without an actual instance of the VBEIDE, I think we should decouple the parser/resolver more. I plan to pay down some technical debt there anyway.

Although the parser ultimately needs the code modules (in memory) to parse them, we could provide them via some IModuleCodeProvider. Something similar could be done for the precompiler directives.
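A minimal sketch of what such a provider could look like. Only the name IModuleCodeProvider comes from the comment above; the member `GetCode` and both implementing classes are assumptions made for illustration and are not part of Rubberduck.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical abstraction: the shape of this interface is an assumption.
public interface IModuleCodeProvider
{
    // Returns the full source text of the named module.
    string GetCode(string moduleName);
}

// Serves code held in memory, e.g. text pasted into a web form.
public sealed class InMemoryModuleCodeProvider : IModuleCodeProvider
{
    // VBA identifiers are case-insensitive, so the lookup is too.
    private readonly Dictionary<string, string> _modules =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    public void Add(string moduleName, string code)
    {
        _modules[moduleName] = code;
    }

    public string GetCode(string moduleName)
    {
        string code;
        if (!_modules.TryGetValue(moduleName, out code))
        {
            throw new KeyNotFoundException("Unknown module: " + moduleName);
        }
        return code;
    }
}

// Serves code from exported .bas/.cls files in a folder on disk.
public sealed class ExportedFileModuleCodeProvider : IModuleCodeProvider
{
    private readonly string _folder;

    public ExportedFileModuleCodeProvider(string folder)
    {
        _folder = folder;
    }

    public string GetCode(string moduleName)
    {
        foreach (var extension in new[] { ".bas", ".cls" })
        {
            var path = Path.Combine(_folder, moduleName + extension);
            if (File.Exists(path))
            {
                return File.ReadAllText(path);
            }
        }
        throw new FileNotFoundException("No exported file for module: " + moduleName);
    }
}
```

With something like this, a parser would depend only on the interface, and a COM-backed implementation could sit next to the two shown here.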

Actually, I think extracting the parser itself and removing all COM references from it is not that much of a problem, although the devil is usually in the details. Moreover, the reference loader is already separated from basically everything, thanks to @comintern. I have never really looked at that part of the code, but I think it does not use the VBEIDE at all to inspect the type libraries.

The biggest obstacle I see to decoupling the entire parsing and resolving engine from the VBEIDE is that basically everything depends on IVBComponent for various reasons. However, maybe we can replace those references with a key and store a mapping in some class, so that functionality that really requires access to the component can get it there.
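The key-plus-mapping idea could be sketched roughly as follows. `ComponentKey` and `ComponentRegistry` are hypothetical names invented for this illustration; in Rubberduck the type parameter would presumably be IVBComponent, but nothing here reflects the project's actual design.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key: identifies a component without holding a COM reference.
public sealed class ComponentKey : IEquatable<ComponentKey>
{
    public ComponentKey(string projectId, string componentName)
    {
        if (projectId == null) throw new ArgumentNullException(nameof(projectId));
        if (componentName == null) throw new ArgumentNullException(nameof(componentName));
        ProjectId = projectId;
        ComponentName = componentName;
    }

    public string ProjectId { get; }
    public string ComponentName { get; }

    // Component names are compared case-insensitively, as VBA does.
    public bool Equals(ComponentKey other)
    {
        return other != null
            && string.Equals(ProjectId, other.ProjectId, StringComparison.Ordinal)
            && string.Equals(ComponentName, other.ComponentName, StringComparison.OrdinalIgnoreCase);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as ComponentKey);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return (StringComparer.Ordinal.GetHashCode(ProjectId) * 397)
                 ^ StringComparer.OrdinalIgnoreCase.GetHashCode(ComponentName);
        }
    }
}

// Hypothetical registry: the only place that maps keys back to live components.
public sealed class ComponentRegistry<TComponent> where TComponent : class
{
    private readonly Dictionary<ComponentKey, TComponent> _components =
        new Dictionary<ComponentKey, TComponent>();

    public void Register(ComponentKey key, TComponent component)
    {
        _components[key] = component;
    }

    public bool TryGetComponent(ComponentKey key, out TComponent component)
    {
        return _components.TryGetValue(key, out component);
    }
}
```

Declarations and parse results would then carry only a `ComponentKey`, and the few features that need the real component would ask the registry for it.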

All in all, I think we can and should decouple the parsing engine quite a bit more, both from the IDE and internally.

MDoerner on 22 Mar 2017

👍2

All 16 comments

They say great minds think alike... Splitting RD into a plugin architecture has actually been on the books since before 2.0 was released, and is currently slated for 3.0--expect it to be released in ~~6-8 weeks~~ maybe a year or two. Meanwhile, you could just contribute to the project as it is; come visit us in our chat room.

Hosch250 on 22 Mar 2017

Note: this is basically covered in CodeNameCucumber, but I think it's good to have this in the project to remind us 👍

Vogel612 on 22 Mar 2017

> a simple library dedicated for parsing VBA Code with no dependencies on VBIDE

"Parsing VBA code" is hardly simple. Precompiler directives can use constants that only exist at project level, that even the VBIDE API doesn't expose. And then suppose you can get a parse tree and even a symbol table - you still need to resolve identifier references if you're going to want to be able to analyze anything! And in order to know (reliably) that MyAwesomeClass.Create(42, "foo", "bar") is calling the Create method of MyAwesomeClass which has a VB_PredeclaredId attribute set to True, you need to be picking up that VB_PredeclaredId attribute value somehow - and you can't do that without first pre-processing the exported module. That's one complication: there are countless others. Range calls and implicit default member calls come to mind.
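To make the VB_PredeclaredId point concrete: the attribute is not visible in the editor's code pane, but it does appear as a header line in an exported .cls file. A minimal sketch of picking it up from exported text follows; `ModuleAttributeReader` is a hypothetical helper written for this illustration and is not how Rubberduck's preprocessor works.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: reads one module attribute from exported source text.
public static class ModuleAttributeReader
{
    // Matches a header line such as: Attribute VB_PredeclaredId = True
    private static readonly Regex PredeclaredId = new Regex(
        @"^\s*Attribute\s+VB_PredeclaredId\s*=\s*(True|False)\s*$",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Returns true only when the attribute is present and set to True.
    public static bool HasPredeclaredId(string exportedCode)
    {
        var match = PredeclaredId.Match(exportedCode);
        return match.Success
            && string.Equals(match.Groups[1].Value, "True", StringComparison.OrdinalIgnoreCase);
    }
}
```

This covers only one attribute in one file; resolving `MyAwesomeClass.Create(...)` still requires the symbol table and reference resolution described above.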

If VBA code was self-contained and didn't reference half a dozen arbitrary COM libraries - heck scratch that, it doesn't even need to reference the libraries it uses (CreateObject -> late-bound dependency)... Aside from a very limited and error-prone "parser" that Rubberduck couldn't use at all, I don't see how the parser could live on its own without requiring at least interfaces to the VBIDE API... which is exactly what Rubberduck.Parsing.dll is: a separate class library that uses interfaces defined (and implemented) in Rubberduck.VBEditor.dll, which wrap the entire VBIDE API; if you want a reliable on-the-side parser, you need to provide an implementation for all these interfaces somehow - these interfaces are eventually going to allow us to make Rubberduck run as a VB6 add-in. But stand-alone processing?

In the past couple of years I have learned too much about how VBA code really works to say that's a possibility.

> a library for analyzing the results and returning metrics / evaluating rules (Linq?)

Without identifier references resolved, there's nothing to analyze. Full stop.

You simply can't have Rubberduck's analytical power without its "turbocharged" engine.

The plug-in architecture we're going to be discussing involves, among other things, exposing APIs to work with Rubberduck's understanding of the code. But no, it will not be able to work "without a dependency on VBIDE".

retailcoder on 22 Mar 2017

I might have misread something.

By working with Rubberduck's [future] plug-in API, RD plug-ins will definitely be able to use ALL of Rubberduck's analytical power and code-rewriting capabilities, without a direct dependency on the VBIDE API.

retailcoder on 22 Mar 2017

@retailcoder By simple I mean easy-to-use APIs.

My statement regarding VBIDE was more along the lines of, "Parsing and analysis should be possible without a physical instance of Visual Basic for Applications running." i.e. if I have a bunch of bas and cls files on disk, I should be able to tell the Rubberduck API to analyze them and tell me how maintainable my code is, etc., à la VBDepend.

As it currently stands, from my understanding, Rubberduck.VBEditor is a COM wrapper for VBIDE.dll, et al., and without Rubberduck running as a COM AddIn it has no way of getting to the code or knowing where the code is, because it's dependent on Sections defined by CodeModule instances, etc.

This request proposes separate libraries to be able to do stuff like:

```C#
// Proposed (hypothetical) API: VBProject, VBParser, IRule and the rule classes are illustrative only.
var project = new VBProject();
project.AddFile(@"C:\MyProject\Module1.bas"); // verbatim strings keep the backslashes literal
project.AddFile(@"C:\MyProject\Class1.cls");

var rules = new List<IRule>();
rules.Add(new OptionExplicitRequired());
rules.Add(new NoTrailingSpaces());
rules.Add(new NoGoToStatements());

var parser = new VBParser(project, rules);
var results = parser.Parse();

// Report the cyclomatic complexity (CC) of every method found.
foreach (var method in results.Methods)
{
    Console.WriteLine("Method {0} has CC of {1}.", method.Name, method.CyclomaticComplexity);
}

// Report every broken rule together with its location.
foreach (var brokenRule in results.BrokenRules)
{
    var location = brokenRule.Location;
    var msg = string.Format("Rule {0} broken on Line {1} of file {2}.",
        brokenRule.Description, location.Line, location.File);
    Console.WriteLine(msg);
}
```

That'd be pretty slick, eh?

robodude666 on 22 Mar 2017

@robodude666 that looks very much like the MockVbeBuilder API we [ab]use in RubberduckTests.dll - replace IRule with IInspection and you pretty much have what Rubberduck-WEB is already doing - running RD inspections outside the VBE, out of a textbox on a web page:

````csharp
[HttpPost]
public Task GetInspectionResults(string code)
{
    //Arrange
    var builder = new MockVbeBuilder();

    // ensure line endings are \r\n
    code = code.Replace("\r\n", "\n").Replace("\n", "\r\n");
    var vbe = builder.ProjectBuilder("WebInspector", ProjectProtection.Unprotected)
                     .AddReference("VBA", MockVbeBuilder.LibraryPathVBA, 4, 1, true)
                     .AddReference("Excel", MockVbeBuilder.LibraryPathMsExcel, 1, 7, true)
                     .AddReference("Office", MockVbeBuilder.LibraryPathMsOffice, 2, 5, true)
                     .AddReference("Scripting", MockVbeBuilder.LibraryPathScripting, 1, 0, true)
                     .AddComponent("WebModule", ComponentType.StandardModule, code)
                     .MockVbeBuilder().Build();
    var mockHost = new Mock<IHostApplication>();
    mockHost.SetupGet(m => m.ApplicationName).Returns("Excel");
    vbe.Setup(m => m.HostApplication()).Returns(() => mockHost.Object);

    var path = Server.MapPath("~/Declarations");
    var parser = MockParser.Create(vbe.Object, _state, path);
    parser.State.AddTestLibrary(path + "/VBA.4.2.xml");
    parser.State.AddTestLibrary(path + "/Excel.1.8.xml");
    parser.State.AddTestLibrary(path + "/Office.2.7.xml");
    parser.State.AddTestLibrary(path + "/Scripting.1.0.xml");

    try
    {
        Task.Run(() => parser.Parse(new CancellationTokenSource())).Wait();
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
    }
    if (parser.State.Status >= ParserState.Error)
    {
        return Task.FromResult(PartialView("InspectionResults", null));
    }

    var results = _inspector.Inspect(parser.State);
    return Task.FromResult(PartialView("InspectionResults", results));
}
````

Now, the website is designed to list and run all inspections, not just a cherry-picked bunch - you can't new up an inspection just like that, they have their dependencies, which themselves have their own dependencies, which can also have their own dependencies - that's why we use Ninject to new everything up for us:

````csharp
private static void BindCodeInspectionTypes(IKernel kernel)
{
    var assembly = Assembly.GetAssembly(typeof(InspectionBase));
    var inspections = assembly.GetTypes().Where(type => type.BaseType == typeof(InspectionBase));

    foreach (var inspection in inspections)
    {
        kernel.Bind<IInspection>().To(inspection).InRequestScope();
    }
}
````

But yeah, what you're asking is already technically possible.

retailcoder on 22 Mar 2017

👍1

So, Retailcoder beat me to the punch, but his response does bring another question to mind. Should those mocks be moved and renamed "InMemory..."?

rubberduck203 on 22 Mar 2017

@rubberduck203 maybe - but wrapping the mock API without taking away any functionality would be quite a feat.

retailcoder on 22 Mar 2017

That's interesting. Didn't know about MockVbeBuilder or RubberduckWeb's online inspector.

robodude666 on 22 Mar 2017

All in all, I think we can and should decouple the parsing engine quite a bit more, both from the IDE and internally.

MDoerner on 22 Mar 2017

👍2

It would be fantastic to natively support a 100% in-memory analysis without having to Mock stuff, as it does appear to be fairly slow to execute.

Rubberduck already has interfaces for all of the VBEditor dependencies, as well as SafeComWrappers implementations. It should certainly be possible to add another set of implementations intended for "in memory" use that has no dependency on the COM objects.

If Rubberduck is launched in a COM environment, it will use the SafeComWrappers but if you try to start InMemoryVbeBuilder it will default to the "manual" implementations.

robodude666 on 22 Mar 2017

@robodude666 oh, it runs in a few milliseconds in our test suite - I've no idea what's taking so long on the website really.

retailcoder on 22 Mar 2017

Which reminds me that I need to take some time to look at that...

rubberduck203 on 22 Mar 2017

@rubberduck203 it might have something to do with us creating all inspections and the per-request scoping of dependencies, but the parser state must be per-request, and it's a dependency of quite a few things IIRC. IDK, could be the xml-deserialization taking longer than we think.. Ideally there would be feedback for every step of the process..

retailcoder on 22 Mar 2017

Extracted enough code from RubberduckWeb into a standalone project to make GetInspectionResults work. Parsing the Test(foo) example takes ~9s :(. I'll eventually figure out which part is the bottleneck.

robodude666 on 22 Mar 2017

9 seconds is about what I get locally when debugging RD-Web; no idea why it goes up to 45+ seconds once deployed though. Some inspections have O(sh!t) complexity.. we're working on improving that. Likely it's one or two inspections (probably the Excel-specific ones) dragging everything down.
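One way to find out which step or inspection dominates is to time each one separately. The sketch below is a generic stopwatch harness, not Rubberduck code: `StepTimer` is a hypothetical helper, and each step is passed in as a plain `Action` so that no Rubberduck API has to be assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper: times named steps so the slowest ones stand out.
public static class StepTimer
{
    // Runs each step once, in order, and returns the timings sorted slowest first.
    public static IReadOnlyList<KeyValuePair<string, TimeSpan>> Measure(
        IEnumerable<KeyValuePair<string, Action>> steps)
    {
        var timings = new List<KeyValuePair<string, TimeSpan>>();
        foreach (var step in steps)
        {
            var stopwatch = Stopwatch.StartNew();
            step.Value();
            stopwatch.Stop();
            timings.Add(new KeyValuePair<string, TimeSpan>(step.Key, stopwatch.Elapsed));
        }
        return timings.OrderByDescending(t => t.Value).ToList();
    }
}
```

Wrapping library deserialization, the parse, and each individual inspection as separate steps would show whether the time goes into one or two inspections or into setup.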

retailcoder on 22 Mar 2017
