xmlHelpline Blog
Xml, Xslt, data standards, and anything else...









AWS CloudFormation: reluctantly embracing YAML


Yet Another Markup Language?  Really?
That was my view of YAML for a very long time.  As someone who has no problem with various formats for structured information like xml, json, etc, I couldn't see any real use case for making another one.  And then came AWS CloudFormation.
AWS CloudFormation uses templates for provisioning assets in the cloud.  I'd been working on a problem that involved creating a Lambda function.  This in turn means creating and managing an IAM Role and Policy.  It also pulled in some S3 for storage (trying to go full serverless).  And so the CF template was quickly getting sizable.  And since there is no CF schema to work with (although there are attempts at such), there is no authoring "helper" that can be derived from a governing schema.  That is what makes Xml easy to author no matter how large the document gets: the schema can guide both you and your authoring tools.
Having used JSON for a long time (and even JSON Schema too), I was working with that syntax in CF.  The tool support for JSON is good and makes it easier to work with, especially features like code folding.  But when you scale JSON larger and larger, it starts to become unwieldy.  Don't get me wrong, I've dealt with JSON files many megabytes in size before, so it isn't just a matter of size.  With large JSON files that are consistent and repetitive, you can fold code, copy code, and manage it not unlike Xml.
However with CF, I was getting frustrated at the brute-force trial and error I was engaged in.  (Ok, I know there are tons of template snippets out there - and I was using them - but nothing did EXACTLY what I was trying to do.  So they were all only "sorta" useful.)  I finally broke down and decided to try something new just to see if it had value.  And I dug into YAML for CF templates.  Structured, hierarchical, foldable code is very familiar territory, so the basics of YAML were no problem.  But was it any better than JSON?
Enter Notepad++.  This tool brought the features needed to make YAML work just like JSON and Xml - first and foremost, code folding.  So I was more easily able to take code snippets and work them into my CF template, testing along the way.  And I began to like YAML.  Far more readable and more compact in terms of fitting code on a viewable page, it let me visualize more of what I was authoring.  That has proven key to my progress.  And the significant whitespace makes the indentation easy to follow.  It was starting to look a lot like Python.
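To illustrate the compactness, here is the kind of Lambda resource I was folding in and out of view - a minimal sketch with made-up names and properties, not my actual template:

Resources:
  MyFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: my-function            # illustrative name
      Handler: index.handler
      Runtime: python3.6
      Role: !GetAtt MyFunctionRole.Arn     # the IAM Role defined elsewhere in the template
      Code:
        S3Bucket: my-deployment-bucket     # the S3 side of the serverless setup
        S3Key: my-function.zip

The same resource in JSON needs braces, quotes, and commas on every line, and the inline # comments above aren't even possible there.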
So I've come around to this new "Python-like" markup known as YAML.  It does have a use case that I can get behind.  Call me a reluctant embracer of YAML.

Certified Solutions Architect


Passed the exam today.  Really pleased at reaching this goal.  I guess I've gone from knowing just enough to be dangerous to being actually dangerous!  The test was challenging but also about what I'd expected.  My road to certification started with a great course from Ryan Kroonenburg of ACloudGuru.  He not only knows his stuff, but he is an excellent communicator.  It can sometimes be hard to get both in one person.  I also had some key resources that helped.  First, beyond the course itself, was the official study guide from AWS.  It was originally published in October 2016, so some parts are dated, but for the most part it was an excellent resource.  I also read that the folks who wrote the book also wrote the exam (haven't verified this myself).  Second, I bought some quizzes from Whizlabs.  They have 8 full-length tests that are challenging and informative, and they help identify weak points to study.  In addition, I got some tests from IAASAcademy.  Those were also good, although I thought the Whizlabs ones were better.  And last (but certainly not least in importance) were the AWS documents, including the FAQs for each service.  Very useful resources for sure.  (Oh, and of course the labs.)
So now that I hit my goal, it's time to do some damage.

Escapades in AWS Certification


Adventures in cloud computing.  Like many folks, I've been reading about this new game changer for some time.  And I've been experimenting with some of the tools.  S3 storage for static web hosting: check.  EC2 instances for managing compute tasks: check.  Even trying some auto scaling, which is one key item that makes so much sense.  I think I'd learned just enough to know I was dangerous.
I had some ideas as to how I might like to make use of cloud technology: migrate some of my tools to AWS and leverage Lambda functions in the design of how they would work.  Lambda is the game changer within the game changer.  I can see the amazing potential.  I wouldn't need to start out with it at first, but eventually I'd get there.
After playing around in my own sandbox, I concluded there is no better way to learn than to get my hands dirtier and get certified.  So I embarked on the path to the Solutions Architect certification exam.
One can't research too long before finding that A Cloud Guru stands out as a leader in the teaching of this material.  I found a very inexpensive course on udemy.com and started my adventure.  More to come. 

Xml, JSON, and Darwinian competition

Recently I gave a presentation on the relationship between JSON and Xml technologies.  I'd set it in the context of "friend or foe" since lots of people frame the relationship between these two as some sort of zero-sum competition or Darwinian death match.  On one hand, Xml is the incumbent trying to fend off the nipping upstart, insisting that JSON simply isn't a king killer.  On the other hand, the insurgent JSON is poised to topple the bloated, over-the-hill, yesterday's technology, wresting the title from reluctant dinosaurs.

Having worked in both the data integration and content management spaces, I've seen both camps and how they react to Xml and JSON.  I think the former are very hot on the JSON track, and rightly so.  With cloud applications, bandwidth is now an issue again.  And then there are mobile applications.  Lightweight, simple data structures for not-overly-complex data can make a huge beneficial difference.  So JSON will continue to have an increasing role there.

The content folks see value but are a little less keen on the JSON value proposition.  One example of their skepticism is the concept of mixed content (elements intermingling with text).  This is a big, bright line that differentiates the two technologies.  Having tried several methods of handling this myself, I find that Xml's inherent support for mixed content is a real relief.  And content management folks tend to run into mixed content more frequently than data integration specialists do.  Still, content folks see some value in JSON for sure.  They don't like Xml's bloat any more than anyone else.
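
For illustration, here is the kind of mixed content I mean - markup interleaved with running text (element names are just illustrative):

<para>The <term>schema</term> governs the <emphasis>structure</emphasis> of the document.</para>

About the best JSON can do is an array that alternates strings and objects, which is workable but clunky:

["The ", {"term": "schema"}, " governs the ", {"emphasis": "structure"}, " of the document."]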

Ultimately however, this isn't a death match.  The Darwin analogy doesn't mean there can be only one survivor; rather, there is an array of creatures that each have their strengths and weaknesses.  Like programming languages or Galapagos island animals, there is room for many.  I like JSON and find it very useful and fast for development.  In fact, I've developed JSON applications and experimented with JSON Schema.  (More to come on this in another post.)  And when I come into contact with complex content structures or mixed content of any kind, I'm glad Xml is still in the toolbox.

Bitcoin - is there any "there" there?

I've been reading quite a bit recently about the technology that is popularly known as Bitcoin - the use of block chain computing power to solve mathematical problems in return for money, to put it bluntly.  Article after article spoke to how this can be a transformative technology.
Fair enough.  Time to investigate and see how it is supposed to make people money.  I thought of 2 angles to try out.  First, the easiest is to think of it technologically, and use computing power to test out how things work and how useful it proves.  It seems the things to do are to set up a wallet (after all, I need a wallet to store all my major bucks, right?) and then start "mining" the math to create my way to wealth.  I installed the Bitcoin Core wallet for Windows.  It installed fine and seems to be about what I expected and read about.  Next I installed GUI Miner, which is a client that does the computing.  So if I'm mining for bitcoin gold, where do I land my first shovel?  In order to find a place to squat and stake a claim, it's best to follow someone who knows.

So enter Slush Pool.  Pools are ways of aggregating computing power with a shared reward.  Slush's Pool claims to be the world's first mining pool.  It's at least a place to start.  Soon I'm set up with a wallet and I'm using GUI Miner to mine coins in Slush's pool.  So I sit back and let my computer make me money, right?!  Seeing the early returns, it's clear that it will take a very long time to make any money this way.  Can I reduce my overhead to maximize my margins?

Researching pools, one quickly gets into issues of governance.  The competition to attract miners leads to claims of transparency and low-cost pool providers.  (An interesting view that money creates government instead of the other way around. :) )  Being an advocate of a Vanguard investment philosophy, I favor the strategy of keeping overhead low: I'll beat the higher-cost guys most of the time without even trying.  But the global nature of this setting quickly becomes apparent.  I'm not only choosing a pool that may claim to have low overhead fees, but my mining efforts are also competing against third-world cost structures.  Calculators spring up to tell you how your costs affect your mining potential.

In fact, in mining, the calculation of benefit soon becomes a discussion of your electricity rates.  Since coins are minted using computing power, one needs to factor electricity costs into the potential profit margins.  But since "1"s and "0"s do not recognize borders, my first-world electricity costs quickly mean I'm competing against inherently cheaper places around the world.  This means I'm starting to sour on raw mining for profit.  The margins simply aren't there unless I can employ an armada of machines at third-world electricity rates.

So I've learned about the technology that makes it work, and I've learned that the mechanics of mining mean one will never get rich that way; this task is better left to low-overhead miners.  What about a more philosophical or entrepreneurial view?  (Meaning I want to own my own pool.)  Where is the opportunity to put this technology toward something different or in line with my goals?  Can it be leveraged to solve a bigger problem?  I'd like to see this applied to something useful like fighting malaria or some other important goal.  Pools that simply make money are obvious and already exist, and another one won't stand out.  But creating a pool that attracts investors (miners) for some motivation other than simply making money might do the trick.  Indeed, some of the pools are motivated by philosophies that attract a certain kind of motivated miner.  This remains my landing point in this story.

I'm left intrigued with Bitcoin (and the underlying block chain technology) even if I've not found a path to riches nor used it to solve a bigger problem.  The fact that it is making some inroads to mainstream usage and acceptance means it isn't a fad.  The technology is interesting and I can understand the attraction.  So there is some "there" there.  I'm just not sure where this fits into my strategies as yet.  Perhaps you'll find me next announcing a new mining pool that will plow all profits into fighting malaria.

replace front disc brake pads 1997 Toyota Camry

I've been in car maintenance mode, as you can probably tell.  This time, it's been a long-standing issue.  Quite frankly, these brakes have been squeaking ever since I got the car.  Very annoying and actually embarrassing when driving friends.  I was told when I got it that the brakes were not that old, and because of the squeaking the mechanic had put in the exact OEM pads for this car.
So are they just worn out?  Do the slider pins need lubrication?  Something else?  As it turns out, I think the problem was the pads.  They were not worn down all the way, but they are metallic pads.  I switched to ceramic and this made all the difference.  Here is how I did it.

1997 Toyota Camry radiator replacement

Here is another item in the "anything else" category.  I recently had some car trouble and made a short video of a repair I did.  I have a 1997 Toyota Camry and I was getting P0115 error codes from my OBD 2 reader.  As I was about to replace the ECT coolant temperature sensor, the radiator turned out to be leaking.  So I ended up replacing the radiator.  This video shows how I did it.

URIResolver with XSLT2 using Saxon on Tomcat and JSTL

Ran into a rather maddening problem last week.  I was working on a front end to a tool and was planning on using JSP within a Tomcat environment.  I'd downloaded the latest Tomcat (8.0.9 to be exact) and it installed ok.  Most of my app is xml based and I needed to use XSLT 2.  So I grabbed Saxon 9 (9.1.0.2J - I later tried 9.5.1.7J with the same behaviour), added it to my lib directory, and with an environment variable update, presto - I was able to perform transformations.  (I just needed a property set in the JSP.)
<%
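// tell JAXP to use Saxon's TransformerFactory so XSLT 2.0 is available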
System.setProperty("javax.xml.transform.TransformerFactory", "net.sf.saxon.TransformerFactoryImpl");
%>
So far a happy story, right?  The issue came up around relative and absolute paths.  The collection() function was throwing errors if I tried to use a relative path.  That was annoying but not the end of the world, as I could supply the full path behind the scenes.  Maddeningly, the doc() function was throwing an error if I used an absolute path.  So I had 2 functions that were each doing their own thing at different places in the tool, but one required a full path and one a relative path.  No exceptions.  I could work around this, but it seemed silly to have to do so.

I wasn't sure if the problem was my code, java, tomcat, or saxon (can you guess which?).  I found that it wasn't anything to do with encoding, so I could rule that out.  I started doing research and found some interesting (though dated) discussions here, here, and here.  The issue was apparently around the URIResolver.  Potential workarounds/solutions here, here, here, and here.

I also ran the document-uri(.) function on some loaded documents.  It reflected this problem, as it was returning a path that started with "jstl:/../" instead of "file:///c:/" or even "http://localhost".  So the resolver was definitely the problem.

Just as I was about to contemplate writing a custom URIResolver, I did some more digging in my JSTL tagging.  And it hit me that I might have outsmarted myself.  It turns out that @xsltSystemId not only provides the path for the XSLT, but also serves as the base for all relative URIs used in the XSLT - imports, doc(), collection(), etc. are all resolved against it.  So my solution was a humbling, simple attribute on my JSTL:

<x:transform xml="${thexml}" xslt="${xslt}"
    xsltSystemId="${basepath}select-schema.xslt"/>
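
For illustration (the file names here are hypothetical), with xsltSystemId set this way, relative references inside select-schema.xslt resolve against ${basepath} instead of the broken jstl: pseudo-scheme:

<!-- inside the stylesheet: all of these now resolve relative to the xsltSystemId base -->
<xsl:import href="common.xslt"/>
<xsl:variable name="lookup" select="doc('lookup.xml')"/>
<xsl:variable name="schemas" select="collection('schemas?select=*.xsd')"/>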

****postscript****

Here is more info on the errors I'd gotten:

When a known and correct relative path was used, the collection() function resulted in this error (snipped for brevity):
HTTP Status 500 - javax.servlet.ServletException: javax.servlet.jsp.JspException: net.sf.saxon.trans.XPathException: Cannot resolve relative URI: Invalid base URI: Expected scheme-specific part at index 5: jstl:: jstl:
Meanwhile, when the full path is given and the collection() function works correctly, the same full path in the doc() function later in the process returns this error (also snipped):
HTTP Status 500 - java.lang.IllegalArgumentException: Expected scheme-specific part at index 5: jstl:
 

java.lang.IllegalArgumentException: Expected scheme-specific part at index 5: jstl:
    java.net.URI.create(Unknown Source)
    java.net.URI.resolve(Unknown Source)
    net.sf.saxon.functions.ResolveURI.tryToExpand(ResolveURI.java:115)
    net.sf.saxon.StandardURIResolver.resolve(StandardURIResolver.java:165)
    




The Definitive Guide to TMS Top 5 Lists

This goes into the "anything else" category. One of my other passions is rock and roll music. And many people who hang around the worlds of hard rock and heavy metal are aware of a VH1 classic TV show called "That Metal Show" (@ThatMetalShow  #TMS). Hosted by Eddie Trunk, Don Jamieson, and Jim Florentine, the show has become a focal point of discussion, awareness, and fun around this music genre.

The show has numerous segments, but the one that struck me is the "Top 5" lists.  Host-selected topics are debated and a "final" list is determined.  This blog post shows just how diligently (or perhaps crazily) I've taken an interest in these lists.  I researched the shows and found that nowhere was there a definitive list of Top 5 lists.  So I created one myself!

Here is the link to The Definitive Guide to TMS Top 5 Lists. You can use this blog post to add comments (or to help me track down the few episodes for which I cannot find the list). Enjoy!

great XSLT function site

I want to give a shout out to Priscilla Walmsley's list of XSLT functions.  I have used it many times and find the site easy to read and consume.  Especially when I'm wrapping my head around some nasty namespace management issues, I find myself coming back to the site over and over.

Thanks Priscilla for this great XSLT resource!

http://www.xsltfunctions.com/xsl/

send me your worst use case! SchemaLightener update underway

I am working on updating the code of the #SchemaLightener (which also flattens schemas and wsdl files) to use XSLT 2.0, along with other enhancements.  Having offered this tool for years, I've accumulated many use cases that I can test with.  And of course, I use many consortia standard schemas and wsdls.  However, I want your worst use cases so I can make this tool the best it can be!  Don't worry - I won't redistribute them, so you are free to send me your ugliest cases!
Simply email me and I'll work to incorporate these use cases into the testing of this new version.
And thank you.

How I learned to love @OASISopen CAM

I've played around with @OASISopen CAM in the past, but mostly from a learning and experimentation perspective.  I thought it an interesting technology and always wanted to find a reason to use it in real life.  But for some time that opportunity wasn't at hand. 

The most powerful aspect of CAM is that it allows the data model and its constraints to travel and live together.

Schematron, (xml schema 1.1), and xslt

The problem:
We were working with industry standards in the form of xml schema data models.  These were the starting point.  There was a need to add additional constraints - either within the standard (e.g. co-occurrence constraints, which can't be expressed in xml schema 1.0) or outside the standard, where businesses take the standard as a base and build their own additional constraints onto it.

Schematron to the rescue?
The most logical technology to use at the time was of course schematron (and it still is).  The problem I was trying to solve was that schematron was understood by xml geeks like myself, while the people who knew the business rules were speaking an entirely different language.  So the only choices were either to have the xml geeks be the translators or to create a translation tool for business people to use.  In one sense I was doing both.  I created a simplified schematron interface specifically for business analysts.  With a simple enough interface, business people could input simple rules without a problem - no middle man.  But any more complex rules could only be roughed in, and the xml geek would then need to step in and translate the business rules into schematron patterns.

The interface started out as absolute simplicity.  It was "if ... then" at its core: if this business condition exists, then some other rule applies.  At its simplest, 2 element names were the minimum input needed.  And it was put in a simple HTML web form, an interface BAs were very familiar with.  The web form would take the input and generate the schematron assertions, then use the schematron skeleton template to create an XSLT that would validate xml against these assertions.  At that time the skeleton was used widely because no ANT task or native schematron processor was in place.
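
As an illustration (the element names are hypothetical, not from the actual project), a BA entering "if ShipTo then BillTo" into the form would come out the other end as an assertion along these lines:

<sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
    <sch:pattern name="if-then-rules">
        <!-- if an Order has a ShipTo, then it must also have a BillTo -->
        <sch:rule context="Order[ShipTo]">
            <sch:assert test="BillTo">If ShipTo is present, then BillTo must also be present.</sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>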

The simplicity of this approach was both its biggest strength and of course also its biggest weakness.  With only a rudimentary knowledge of xml, a business analyst could create schematron-compliant rules by simply identifying the 2 components of this "if ... then" assertion.  The schematron and XSLT validator came out the other end automagically.  But of course complex assertions defied this method, and so the xml expert had to intervene and help the BA formulate the rules.

This worked for its limited aims.  But we still had the problem of separate technologies for validation.  It would be best to have schematron-style rules embedded directly into the schema.  Indeed, this is what xml schema 1.1 would eventually enable.
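
For comparison, here is what that same hypothetical "if ShipTo then BillTo" rule looks like as an xml schema 1.1 assertion, living right inside the content model (again, a sketch with made-up names):

    <xsd:complexType name="OrderType">
        <xsd:sequence>
            <xsd:element name="ShipTo" type="xsd:string" minOccurs="0"/>
            <xsd:element name="BillTo" type="xsd:string" minOccurs="0"/>
        </xsd:sequence>
        <!-- co-occurrence constraint travels with the content model -->
        <xsd:assert test="not(ShipTo) or BillTo"/>
    </xsd:complexType>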

Enter @OASISopen CAM
Working with CAM on a consulting gig moved me from seeing it as merely an interesting technology to one that I like.  One big benefit is that all validation rules can be put in one place.  Content model, data typing, co-occurrence, or any other constraint can travel together.

Secondly, there is a CAM processor that provides a relatively easy-to-use interface for creating assertions.  It's not quite as simple as my earlier effort, but it's also not as limiting.  So business analysts can do some of the work in creating assertions, although xml knowledge is of course still essential.

While I'm still working with it and learning its warts, I've come to appreciate CAM.  And of course one can't mention CAM without a shout out to David Webber.




#OAGIS X reviewed - part 2

This is part two of a review of OAGIS X release candidate 1 from OAGi.  (twitter: @OAGi_Standards)  See review part one.

One of the interesting aspects of this release is around extensions.  The 2 main extension methods you've seen in previous releases are still there.  The <UserArea> element is still ubiquitous.  And it is by design the last element in a content model in all cases except where there is a type derivation.  This handy element has been one of the mainstays of working with a standard in the real world of sometimes messy and custom data.  Secondly, the elements in release X are still almost all globally scoped.  This enables one to use the substitutionGroup extension method that is also common and goes back to version 8 of OAGIS.
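
As a quick sketch of the substitutionGroup approach (all names here are illustrative, not actual OAGIS declarations): because the standard's elements are global, an extension schema can declare its own element as substitutable for a standard one, provided its type derives from the standard type:

    <!-- standard schema: a globally scoped head element -->
    <xsd:element name="Party" type="PartyType"/>

    <!-- extension schema: a custom element usable wherever Party is referenced -->
    <xsd:element name="AcmeParty" type="AcmePartyType" substitutionGroup="Party"/>
    <xsd:complexType name="AcmePartyType">
        <xsd:complexContent>
            <xsd:extension base="PartyType">
                <xsd:sequence>
                    <xsd:element name="AcmeRegion" type="xsd:string" minOccurs="0"/>
                </xsd:sequence>
            </xsd:extension>
        </xsd:complexContent>
    </xsd:complexType>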

What is different about this release is that it makes management of extensions easier rather than employing some new engineering gadget.  Very practical.  To begin, there is an Extensions.xsd file which centralizes your extension management.  This file is the link between custom and standard content models.  It is where the UserArea global element is defined.  But there are changes from there.  The UserArea is defined as "AllUserAreaType".  This type is yet again a convenience, as it extends the "OpenUserAreaType" (see below) with a sequence that ships empty.  So one can simply add elements to this sequence and instantly have broadly applied additions to the UserArea content model.  Nice.

    <xsd:element name="UserArea" type="AllUserAreaType"/>

    <xsd:complexType name="AllUserAreaType">
        <xsd:complexContent>
            <xsd:extension base="OpenUserAreaType">
                <xsd:sequence><!-- easy to insert extensions here --></xsd:sequence>
            </xsd:extension>
        </xsd:complexContent>
    </xsd:complexType>
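
For example (the element name here is hypothetical), dropping a custom element into that empty sequence makes it available everywhere UserArea appears:

    <xsd:complexType name="AllUserAreaType">
        <xsd:complexContent>
            <xsd:extension base="OpenUserAreaType">
                <xsd:sequence>
                    <!-- hypothetical custom extension element -->
                    <xsd:element name="WarehouseZoneCode" type="xsd:string" minOccurs="0"/>
                </xsd:sequence>
            </xsd:extension>
        </xsd:complexContent>
    </xsd:complexType>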
Next, as mentioned, there is the OpenUserAreaType.  This takes the most commonly used extension elements and puts them explicitly in the UserArea.  Things like name value pairings, codes, IDs, and text can all be used out of the box here.  They will look familiar to folks working with CCTS Representation Terms and the  UN/CEFACT Core Components Data Type Catalogue. In all my years of experience, I've found these kinds of elements in the OpenUserAreaType to be the most often used in a pinch.  So again the management is easy.

Lastly, there is the AnyUserAreaType, which is an xsd:any with strict processContents.  This is how the UserArea used to be defined in previous releases.  In this release, it is a type that can be employed as needed.  In fact the UserAreaType is one of these "any" definitions.  However it is important to note that the UserArea global element is not defined as "UserAreaType" but as "AllUserAreaType", which extends "OpenUserAreaType".  So be sure to keep that straight and you'll have lots of possibilities for managing extensions at your fingertips.


#OAGIS X reviewed - part 1

This is part one of a review of OAGIS X release candidate 1 from OAGi. (see part 2)

The first thing that is crucial to understand is that the semantics remain intact.  Indeed, version X is a non-backwardly-compatible major release, so one might have concerns about changing data models.  However, at the high level, there was no large-scale re-modeling of the data models of the nouns, BODs, or verbs.  This will be comforting to those who have invested in previous versions of OAGIS and who may be worried about what has been done with a new release.  In fact, I was able to create a valid xml instance document from a valid 9.4 version document without much work.

When comparing instance documents, the first area of change you'll notice is in attributes, specifically around codes and identifiers.  For example, the LogicalID element in version 9.4 looks like this:

<LogicalID
schemeID=""
schemeAgencyID=""
schemeVersionID=""
schemeName=""
schemeAgencyName=""
schemeURI=""
schemeDataURI=""
></LogicalID>

Whereas the X version is streamlined, looking like this:

<LogicalID
schemeID=""
schemeAgencyID=""
schemeVersionID=""
></LogicalID>

This effectively pushes values for agency name and the like into what is basically a lookup table.  Not an unreasonable approach.  Generally I've not seen those attributes being used anyway, so this simplification is good.  Certainly when generating sample documents or dealing with the schema as a data model, the existence of all these extra attributes was often cumbersome.  And similar to this ID, the same streamlining was done on codes as well.  Score one for simplification.

The second thing you might notice is that the base namespace is the same.  "http://www.openapplications.org/oagis/9" is still the default namespace in BODs.  I haven't checked with the folks at OAGi about whether this should be "10" instead of "9", but I'll assume that by the time the bundle reaches its production release this will be changed.  In the short term, this makes working between 9.4 and X that much easier (or harder in some cases).  I've kept them separate and so don't have collision problems, but if they must coexist you'll need to manage that carefully.  It is, after all, a candidate release.

I've got more to say so I'll return and update you on this important new release of OAGIS.
(see part 2)

© Copyright Paul Kiel.
