June 2006
My Projects
This page is shows information about projects that I am current working on outside of my day job as Senior Manager at BearingPoint.
Famiva - where families live online
Famiva is a private social network, genealogy and collaboration platform designed for families. Built on Ruby on Rails and Java, Famiva sports powerful yet easy-to-use features that allow families to connect, share photos and stories, keep up with events, etc. Famiva is a free service. Learn more by visiting Famiva.com or the Famiva Blog.
Enterprise Ruby
I am working on two Ruby projects aimed at improving enterprise support for Ruby, both targeted towards XML and web services.
RDIL
RDIL stands for Requirements Definition and Illustration Language. It is a domain-specific language for defining web application requirements using natural-language syntax. Using RDIL, you can define application structure, navigation, look and feel and layout, data types and other behavior using a simple RDIL source file. RDIL Generators are able to take this source file and generate HTML prototype, implementation specific code and other artifacts. Visit the RDIL RubyForge project page for downloads and more information.
ROXML
ROXML is a Ruby Objects to XML mapping library. With simple annotations, it allows Ruby objects to be marshalled back and forth from XML with custom mapping. This way, developers can create Ruby objects according to their Object Oriented design, and XML support without having to work with low-level Hash maps and XML elements. More information about ROXML can be found on the ROXML homepage and the project page.
Uddi4r
Uddi4r provides Universal Description, Discovery and Integration (UDDI) inquiry services for Ruby with the aim of supporting dynamic discovery in Service-Oriented Architectures. My goal with this project is make Ruby and especially Rails be first-class citizens in the SOA tools space. More information about Uddi4r can be found on the uddi4r project page. You can learn about uddi4r at the Enterprise Ruby Conference 2007 in Columbus Ohio.
Designing Databases with ActiveRecord Conventions
Based on Ruby on Rails Philosophy and ActiveRecord Default Naming Standards
Underlying Philosophy
The goal of IT systems is to provide support the business objectives. IT departments have been guilty of focusing too much on their individual function, such as database administration or Java development, digressing into engineering perfect and losing focus of the bigger IT objectives. New conventions are emerging in the software development industry that are breaking old monolithic standards and promoting a more holistic view of the software development functions, with the goal of efficiently delivering business value.
Databases serve a specific purpose in the enterprise: to store and retrieve data so that user-facing services, such as application and business intelligence tools, can meet the business needs efficiently. In other words, if a database design reduces the development time for building a specific application or report while meeting the business objectives, than it should be designed as such, even if it doesn’t follow old (and often unnecessary) conventions. Ruby on Rail’s ActiveRecord framework supports such a convention; it promotes a common-sense approach to building databases that accelerate application and report development while meeting long-term business objectives.
A note on Oracle limitations
Oracle identifiers (table/column names) are limited to 30 characters, which creates unwarranted tendency to abbreviate identifiers, and create artificial and often cryptic naming conventions. If you are concerned about Oracle’s column name length, you’ll be surprised how much you can fit into 30 characters by using good logical names. In many cases, a very long column name is a sign of normalization problem:
- FIRST_INVOICE_CREATED_ON
- CONTRACTOR_POINT_OF_CONTACT
What about Internationalization (I18N)?
The objective of these conventionss is not to eliminate the need for translation of labels. These naming conventions will help you standardize around names that are logical, readable and consistent. Depending on your development environment, it may even help your tool translate your database objects names into your application or report development namespace.
1. Table Names
Convention #1.1 – Naming tables
A database row represents one instance of an entity, while a table is a collection of the same entities. Use plural nouns, e.g. PEOPLE, CONTRACTS, etc for database tables - this follows the common database convention. Use underscore to separate words, e.g. LINE_ITEMS. Do not abbreviate. Do not start table names with a number – this causes errors in certain application platforms. Do not use prefixes like REF_ or LU_ for reference tables, entity names should represent the information being stored, and not try to represent a usage scenario for the information.
Alternatively, you can use the singular form of entity names for tables, thereby following the Object Oriented Programming naming style and avoiding the need for translation from singular to plural and vice versa. ActiveRecord defaults to plural table names, this can be turned off as a global environment setting.
Recommended naming:
- PEOPLE
- PERSON
- LINE_ITEMS
- ORDER_TYPES, TYPES_OF_ORDERS
Not recommended:
- LINEITEMS
- LN_ITEMS
- LI
- REF_ORDER_TYPE
- ORDER_TP_CDS
Convention #1.2 – Naming association tables
For association tables (also known as intersection tables) that resolve many-to-many relationships, use the following naming convention: [name of first entity]_[name of second entity]. This may not always be possible, given Oracle’s limitation. Use it whenever possible, it makes it easier to identify the physical translation of a logical relationship.
Recommended naming:
- PRODUCTS_CATEGORIES
Not recommended:
- PROD_CATEGORY_INT
- PRODUCT_CATEGORY_XREF
Convention #1.3 – Proper use of schemas
Do not put application prefixes on table names for business data – this tightly couples business data to applications. Business data is a function of the business, not the application. There is always a logical name for business entities. Use schemas for the intended purpose of creating namespaces, to logically group relates entities and avoid naming conflicts.
Do not use a master schema that holds all tables, it is an anti-pattern and defeats the purpose of schemas.
Recommended naming:
- ACCOUNTING.ACCOUNTS
- PHONE_BOOK.POINTS_OF_CONTACTS
Not recommended:
- ACCOUNTING_ACCOUNT
- ACT_ACCOUNTS
- PB_POCS
There are other obvious advantages of using schemas for logical grouping. For examples, schema level security policies simplify access control, storage settings can be defined at the schema level. In general, schemas and all objects within schema can be administered using schema-level operations.
Use a separate schema for development and testing, this way your test database can be wiped out or stages for automated unit tests and for testing reports. Use “_DEV” and “_TEST” suffix with the schema name to differentiate. Production schema name should match the logical name; it should not have a suffix.
Primary Keys and Foreign Keys
Convention #2.1 – Using numeric primary keys
Always use auto incremented numeric primary key. For consistency and simplicity, I recommend ‘ID’. It is a noun as defined in English language as a synonym to identity . Also, some frameworks, such as ActiveRecord, use ID as the default name for primary key.
Use auto increment (sequences) as a general rule, unless you want the application tier to manage the keys. Even the most obvious business keys can change, for example, DUNS number for identifying partners. If and when the business key changes, all the relationships will break. Business keys are pieces of business data that should have unique constraints.
Using numeric IDs makes database indexes more efficient. Strings, for instance, take more space to store than numbers. This space is saved in index storage, the space saving is propagated to all child tables’ foreign keys.
Recommended naming:
- ID (INT) AUTO INCREMENT
Not recommended:
- CUSTOMER_ID
- TYPE_CODE
- EMAIL_ADDRESS
- SSN
- ISBN
- DUNS
Convention #2.2 – Using single column keys
This convention is an extension of 2.1. Avoid concatenated keys at all costs. Concatenated keys complicate indexes and all DML statements. They take more space to store and index. The storage need and SQL complexity multiplies with every child table (the entire multi-column key has to be stored in every child table). Moreover, concatenated keys make object relational mapping complicated and error-prone. Follow this convention in conjunction with 2.1, use single integer primary keys – auto incremented whenever possible.
Recommended naming:
- ID (INT) AUTO INCREMENT
Not recommended:
- ORDER_DATE + CUSTOMER_ID
- FIRST_NAME + LAST_4_SSN
Convention #2.3 – Naming foreign keys
This may sound like a violation of convention #2.2 and general principle of using logical names. The logic behind this convention is that all relationships can be named as one of the three types of associations:
- Entity A belongs to Entity B (child to parent)
- Entity A has zero, one or many Entity Bs (parent to child)
- Entity A has many and belongs to many Entity Bs (many to many)
If you use the above listed logical names for foreign key naming, the column name will not tell you which entity it is associating the entity with. This becomes a bigger problem when you have a child table with multiple parents, e.g. an order_items table could have a foreign key to an order and discount type.
ActiveRecord recommends using the following convention: [parent entity name]_ID. The advantage of using this convention is that you can tell the parent entity, direction and type of the association.
There is a downside to this naming convention; the column name does not tell you the nature of the association. For example, an employee can have two circular relationships, “reports to employee” and “mentored by employee.” In these kinds of scenarios, you can either extract the foreign key/keys into an association table, or use the following convention: [nature of association]_[name of parent entity]_ID.
I don’t feel strongly about this convention, feel free to adopt any one standard and use it consistently; keep in mind that several Object Relational Mapping can infer the relationship if you follow the ActiveRecord convention of [parent entity name]_ID.
Recommended naming:
- ORDER_ID
- REPORTS_TO_EMPLOYEE_ID
Not recommended:
- FK_ORDER
- ORDER
- REPORTS_TO_EMPLOYEE
Column names
Column names should be the logical name of the attribute. A good rule to follow is that the column name should be obvious to the end users; there should be no need to translate the name before it is presented in a report of application. This simple principal will not only add clarity to your database schemas, it will alleviate the need for maintaining application labels (unless your application supports multiple languages). You will also alleviate the need for meta-data for ad hoc reporting tools.
The remaining conventions are an extension and elaboration of this general principal.
Convention #3.1 – Naming columns
Underscores serve as a friendly visual delimiter for words. Some databases allow spaces and mixed case in table names but most popular databases suppress this capability by default. For a standard approach that can be applied to all databases, use upper-case table names with underscores to separate words.
Frameworks like ActiveRecord can translate logical column names into object attributes and/or user interface labels using the following rules:
- First letter of each word will be capitalized
- Underscore is translated to a space and used as a delimiter for words
As an example, a column named “FIRST_NAME” will translate to “First Name” on the user interface.
Recommended naming:
- FIRST_NAME
Not recommended:
- FIRSTNAME
- F_NAME
- FIRST_NM
Convention #3.2 – Avoid table names in column names
The table name specifies the context for the columns. Prefixing column names with table name is redundant. If you find a column name to be ambiguous without the table name, you may want to review your data model for normalization.
Recommended naming:
- CUSTOMERS.NAME (Table CUSTOMERS Column NAME)
Not recommended:
- CUSTOMERS.CUSTOMER_NAME
(Table CUSTOMERS Column CUSTOMER_NAME)
Convention #3.3 – Avoid data type names in column names
Column data type is specified by the column definition in the table. Using type identifiers in the column name are redundant. If you are referring to a code, use convention 2.3. For date/time columns, ActiveRecord recommends using the logical proposition form.
Recommended naming:
- CREATED_ON (Date)
- CREATED_AT (Date + time)
- STATUS_ID
Not recommended:
- DATE_CREATED
- CREATE_DATE
- STATUS_CD
Convention #3.4 – Avoid abbreviations in column names
Do not use abbreviations that are not universal accepted. A good standard is to avoid all abbreviations except those that are defined in one of the official English Language Dictionaries. For example, abbreviations like SSN and ID are acceptable but DT, NUM and CD are not.
Recommended naming:
- ID (for Primary Key Only)
- URL
- FIRST_NAME
- LAST_4_DIGITS_OF_SSN
Not recommended:
- IDENT
- FIRST_NM
- LST_4_SSN
Setting up Intel Mac for Rails development
I am happy to report back that I had no problems switching to MacOS as my development server. Moreover, MacOS interface felt just natural - as if I always knew how to use it. It could be because of all the things that Ubuntu and GNOME has taken from MacOS, like the preferences dialog, and privileged commands (sudo). I also felt readily productive in MacOS once I figured out how to launch the terminal. Bash and VIM, my two favorite console tools were already installed. Here is a report on the tools I configured to migrate from Ubuntu:
Firefox
As of writing, Firefox official releases were only for PowerPC, which runs extremely slow emulated on the Intel platform. I had to get the bleeding edge build of Firefox Universal Binary. Fortunately, you can grab the official release now.
Ruby on Rails 1.1.1
After googling around for a few hours and trying different RoR packagers, I found a nice, non-intrusive Mac Intel package at Todd Huss’s blog. Unlike Locomotive, which is a PowerPC application that hides the underlying Rails infrastructure, Todd’s package gives you a simple directory structure with Ruby and the basic gems that you can drop anywhere on your filesystem.
MySQL 5.0
MySQL was a breeze. Just head on over to the official MySQL site and grab the Mac x86 version.
DarwinPorts
Okay, here’s where it gets interesting: Before I can install RMagick (ImageMagick) and Subversion for Intel, I need to setup DarwinPorts so that I can compile my own binaries. For DarwinPorts, I need Xcode. So, I grabbed the latest Intel compatible Xcode from Apple’s Developer Connection. Grabbed and installed Darwin Ports from here, and followed instructions on RubyForge to install ImageMagick and RMagick.
Subversion and Lighttpd
With DarwinPorts installed, installing subversion and lighttpd were a matter of single commands:
port install subversion port install lighttpd
I used svnadmin to create a new svn respository, copied my svn repository from Ubuntu, and voila! All my versioned files and change history is migrated. Don’t you love it when software just works? Hope this post was helpful to anyone trying to setup an Intel Mac as a Rails development server.
Java will learn from and adopt to Rails
Ruby on Rails is proof, complex web applications don’t have to be complicated. Big kudos to David Heinemeier Henson and the Rails committers. Thanks for the wake-up call. The question is, what do we do with Java now? In the long run, what matters is whether the developer community at large is going to learn from it and adopt it or whether it is going to dismiss Rails as a fad. Here’s the good news: Java community is no fool, it is learning and it’s learning fast. I was at JavaOne 2006 and here’s why I think Java will learn from and adopt to Rails. Like me, you can be a huge Rails fan and a die-hard rubist, but you can not deny the ubiquious Java ecosystem, nor can you deny the great innovation that has come from Java’s cross platform enterprise environment. At the risk of sounding like an apologist, I hereby claim that Java will learn from Rails and take it to the next level. Enough claims, here’s my list of reasons:
- Groovy (Java Virtual Machine’s scripting language) has the benefits of Python and Ruby, such as duck-typing, closures, method-missing, builders, etc. plus the big advantage of running in the highly optimized Java virtual machine, enjoying HotSpot technology and native threads. With the advent of scripting languages support in the JVM with Mustang (Java 6). Groovy will get a big boost in performance — all the productivity minus the interpretation. Whether Ruby enthusiasts agree or not, there is a huge performance gain in running within an intermediate language such Java bytecode. As far as I can see, there is not much information on Ruby 2.0 and it’s VM. Mustang is open, it’s on track, and it will be delivered this year. I sure hope Ruby 2.0 is chugging along, but I just can’t find any evidence online. My money is on Mustang.
- Grails, the Rails copy-cat for Groovy, is onto a great start. Graeme Rocher is certainly embracing the Rails conventions while staying pure to the Java platform by embracing Hibernate. At version 0.1, it’s already looking pretty darn impressive. Not a Rails alternative, mind you - but a great start nonetheless. RIFE and Trails deserve an honorable mentions for highly productive Java environments.
- What is the biggest advantage of Rails? For me, it’s convention over configuration; not having to write all the darn XML files that only describe the obvious. Good news: The era of useless descriptors is over. EJB 3.0 sports smart defaults. In essence, it follows the Rails convention over configuration paradigm. You no longer have to describe the obvious. As far as I am concerned, the biggest advantage of Rails over Java has just been deprecated. With that said, I still think Ruby on Rails has many other advantages over Java. Ruby’s dynamic nature and all the flexibility that it makes possible, such as open-classes, meta-programming and Rails plug-ins. RJS templates, fast and simple automated testing, and Rake are some of my favorites that have not made it to the Java world yet.
- Last but not least, Java will be open-source. I was there when Rich Green said so. My Ubuntu Linux will have a .deb package for Sun Java. Java will no longer be Sun propeitory technology. There will no longer be a need for spending life on open source alternatives such as Mono or even Ruby. Java can re-unite the open source community.
The point: Rails is wonderful. Ruby is fantastic. Java will adopt and take Ruby on Rails to the next level. That’s a few months to a year down the road. I am not holding my breath for Grails, Mustang or open-source Java. I love Ruby on Rails and I will continue to use it. But I will keep my eyes and ears open for what Java will become. You should too. Until then, viva la Rails.