Friday, May 30, 2014

Microsoft 70-486: Plan the application layers

Exam Objectives


Plan data access; plan for separation of concerns; appropriate use of models, views, and controllers; choose between client-side and server side processing; design for scalability

Quick Overview of Training Materials


ASP.NET MVC 4 Models and Data Access
MSDN - Entity Framework (EF) Documentation
MSDN - Guiding Principles For Your ASP.NET MVC Applications
Implementing the Repository and Unit of Work Patterns in an ASP.NET MVC Application
MSDN - The Unit of Work Pattern and Persistence Ignorance
What is ASP.NET MVC? 80 minute technical video for developers, building NerdDinner
MSDN - Action Filters in ASP.NET MVC
Pros/cons between emphasizing client-side or server-side processing
Performance and scalability techniques 101
Looking back: How I'd build a large-scale web app today versus how I did it 10 years ago

Plan data access


Planning data access entails several components. First is the data access technology that will be used. A number of options work with ASP.NET MVC, including plain ADO.NET, Entity Framework, LINQ to SQL, and NHibernate. Data access can be made more loosely coupled to business logic by using the repository pattern. A repository separates data access from business logic and makes it easy to swap in mock repositories for unit testing. Finally, an object relational mapper (ORM) creates a link from objects in the application business model to entities and relationships in the database [OReilly]. Judging by its use in nearly every tutorial I have found, Entity Framework appears to be the most common ORM used with ASP.NET MVC.

This graphic illustrating the repository pattern appears in both the Exam Ref and OReilly texts.
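
To make the repository idea concrete, here is a minimal, language-neutral sketch in Python (not C# and not Entity Framework code). The names Album, AlbumRepository, InMemoryAlbumRepository, and album_title are hypothetical and chosen only for illustration. The point is that the business logic depends on the repository contract, so an in-memory mock can stand in for the database in a unit test.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Album:
    id: int
    title: str


class AlbumRepository(ABC):
    """Contract the business logic depends on; it names no database."""

    @abstractmethod
    def get(self, album_id: int) -> Optional[Album]:
        ...

    @abstractmethod
    def add(self, album: Album) -> None:
        ...


class InMemoryAlbumRepository(AlbumRepository):
    """Hypothetical stand-in for a real data store, usable as a test mock."""

    def __init__(self) -> None:
        self._albums: Dict[int, Album] = {}

    def get(self, album_id: int) -> Optional[Album]:
        return self._albums.get(album_id)

    def add(self, album: Album) -> None:
        self._albums[album.id] = album


def album_title(repo: AlbumRepository, album_id: int) -> str:
    """Business logic: knows only the repository contract."""
    album = repo.get(album_id)
    return album.title if album else "(unknown album)"
```

A database-backed repository would implement the same two methods, and album_title would not need to change.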


Entity Framework can be implemented using one of three conceptual models:  Model First, Code First, Database First.  Database first is used when there is already a database in place that the application will use. With Database First, Entity Framework will reverse engineer the existing database and create objects that map to the tables, views, and stored procs already there.  In Visual Studio, this results in a generated Entity Data Model.  Model First reverses this process: you first create the Entity Data Model in Visual Studio, and use that to generate the database schema.

Code First can be used to go either way, though it seems most common to use Code First to generate a database.  With Code First, we (or Entity Framework) create a number of classes to model the data we will be using. These classes do not depend on EF and are called Plain Old CLR Objects (POCOs).  These classes can have scalar properties that correspond to database fields, and they can have navigation properties that correspond to relationships [link]. 
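
As a rough illustration of scalar versus navigation properties, here is a sketch using plain Python dataclasses as a stand-in for POCOs. This is not Entity Framework code, and the Artist and Album classes and their fields are assumptions made up for the example. What carries over is that the classes reference nothing from a persistence library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Artist:
    artist_id: int  # scalar property: would map to a column
    name: str       # scalar property: would map to a column
    # Navigation property: the "many" side of a one-to-many relationship.
    albums: List["Album"] = field(default_factory=list)


@dataclass
class Album:
    album_id: int   # scalar property
    title: str      # scalar property
    artist_id: int  # scalar property holding the foreign key value
    # Navigation property: follows the relationship back to the artist.
    artist: Optional[Artist] = None
```

Wiring the two together is ordinary object code: create an Artist, create an Album that points at it, and append the album to artist.albums.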

Both the Database First and Model First approaches to data access result in the creation of a special Entity Framework configuration file with an .edmx extension. This file stores information about the database schema and how it maps to the conceptual data model in the application. With Code First, however, EF does not use a configuration file and instead generates the database schema dynamically.

Because the Code First approach uses POCOs that do not inherit from the EntityObject base class, it is possible to implement persistence ignorance (PI) only with the Code First approach.  With PI, the business model includes no code for connecting to databases or other means of persistence.

Plan for separation of concerns


Separation of concerns (SoC) is a design principle in which abstraction and encapsulation are used to isolate pieces of code. Each piece has its own small set of responsibilities, or concerns. One example of SoC is a web page whose structure, appearance, and behavior are separated into HTML, CSS, and JavaScript files respectively. SoC also appears in architecture design, with presentation, service, business, and data layers each focused on their own responsibilities. When done right, it produces loosely coupled code in which no layer or module depends on the exact implementation of other components, making maintenance, testing, and reuse much easier. In MVC, separation of concerns is related to the three primary components of the framework: the Models, the Views, and the Controllers.

Each of these three components has the following responsibilities:
  • Model - implements business logic, interacts with data layer to populate business objects for the controller.
  • Controller - intercepts user input and requests, coordinates between the model and the view, and executes application logic.
  • View - renders the application for the user, renders models received from controller. This is the HTML portion of a web application.
While it is possible to separate some concerns, other concerns (called crosscutting concerns) will span layers. Authentication, caching, logging, and state management are just a few of the crosscutting concerns that might be encountered. Crosscutting concerns in ASP.NET MVC are often implemented using action filters [link].
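
The idea behind a filter can be sketched outside of ASP.NET. Below is a Python decorator that wraps an action with logging before and after it runs, loosely analogous to the before/after hooks of an action filter. It is an analogy rather than the MVC API, and the names logged, audit_log, index, and details are hypothetical.

```python
import functools
from typing import Callable, List

audit_log: List[str] = []  # hypothetical in-memory log sink


def logged(action: Callable[..., str]) -> Callable[..., str]:
    """Add logging around an action without touching the action's body."""

    @functools.wraps(action)
    def wrapper(*args, **kwargs) -> str:
        audit_log.append(f"executing {action.__name__}")
        result = action(*args, **kwargs)
        audit_log.append(f"executed {action.__name__}")
        return result

    return wrapper


@logged
def index() -> str:
    return "album list"


@logged
def details(album_id: int) -> str:
    return f"album {album_id}"
```

The logging concern lives in one place and is applied declaratively, which is what keeps it from leaking into every action.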

Appropriate use of models, views, and controllers


Separation of concerns segues nicely into the discussion about the appropriate use of models, views, and controllers in MVC. One common admonition in the tutorials I've worked through is that each piece should do precisely what it is meant to do and no more. There should not be a lot of logic in the view; that belongs in the model and the controller. Data access code is best kept in a DbContext class rather than directly in the controller. But this is getting ahead a little bit; first we need to understand what each component is SUPPOSED to do...

The Model is an object that represents a business entity. In a music store app, albums, artists, and genres could all be represented by models. Data access is also generally accomplished with a model, which is really just a plain class, unlike controllers and views, which have specific dependencies. Also unlike controllers and views, models are not confined by convention to a specific folder in the application. They don't even have to be in the same assembly; in fact, putting models in a separate assembly makes it possible to reuse them across multiple applications. In a case where a view needs specific information from multiple models (as in the Pluralsight tutorial, where they were listing restaurants but also wanted a review count), a viewmodel is appropriate. A viewmodel is a model that is specifically tailored to the needs of a certain view.
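
Here is a small sketch of the restaurant-plus-review-count case, written in Python rather than C#. The class and function names (Restaurant, Review, RestaurantListViewModel, build_view_models) are my own assumptions and are not taken from the tutorial. The viewmodel carries exactly what the list view needs and nothing else.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Restaurant:
    id: int
    name: str


@dataclass
class Review:
    restaurant_id: int
    rating: int


@dataclass
class RestaurantListViewModel:
    """Shaped for one view: a name plus a count drawn from another model."""
    name: str
    review_count: int


def build_view_models(
    restaurants: List[Restaurant], reviews: List[Review]
) -> List[RestaurantListViewModel]:
    # Combine two models into the flat shape the list view displays.
    return [
        RestaurantListViewModel(
            name=r.name,
            review_count=sum(1 for rev in reviews if rev.restaurant_id == r.id),
        )
        for r in restaurants
    ]
```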

The Controller intercepts incoming requests and implements the behavior of the application. When a user makes a request to the application, the routing engine directs this request to the appropriate controller. Within the controller are a number of Actions, all of which have a name. Actions are the methods that are called by the incoming request. Two actions with the same name may behave differently based on which HTTP verb was used in calling the controller. For example, if I call the Create action on the Person controller with a GET request, I may be sent to a form view that allows me to enter the information. Hitting submit will then send this information in a POST request to the same controller, but the controller will use a different Create action that writes the information to a database and sends me to an Index view so that I can see the newly entered data. Returning a view and redirecting to another action are two possible ActionResults that can come from an action. It is also possible to send JSON data, send files, or redirect to another controller or even another page. It is in these actions that the controller will instantiate Models before passing them to a View as an ActionResult.
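
The GET/POST pairing on Create can be sketched as follows. This is a toy dispatcher in Python, not how the MVC runtime selects actions. The naming scheme (create_get, create_post), the dispatch function, and the tuple return values standing in for ActionResults are all assumptions for illustration.

```python
from typing import List, Tuple

Result = Tuple[str, str]  # (kind of result, target name); stands in for ActionResult


class PersonController:
    def __init__(self) -> None:
        self.people: List[str] = []  # stands in for the database

    def create_get(self) -> Result:
        # GET Create: show the empty form.
        return ("view", "Create")

    def create_post(self, name: str) -> Result:
        # POST Create: save the submitted data, then redirect to the list.
        self.people.append(name)
        return ("redirect", "Index")


def dispatch(controller: object, action: str, verb: str, **form: str) -> Result:
    """Pick the method matching both the action name and the HTTP verb."""
    handler = getattr(controller, f"{action.lower()}_{verb.lower()}")
    return handler(**form)
```

Calling dispatch with "Create" and "GET" returns the form view, while the same action name with "POST" and a name field stores the person and redirects.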

The View is the portion of the application that end users see. This is where the HTML lives, along with view engine code like Razor or ASPX. Views are not limited to HTML, however; they can also present information in PDF, XML, or spreadsheet format. The only concern for views is the display of information. There are several types of views: standard views, partial views that are intended to be embedded within a standard view, and, in Razor, layout views that act as a template for multiple pages. A standard view is used for the base page, which might use partial views along with AJAX to update small bits of the page, and the whole thing is wrapped in the layout view, which includes the <head> tag, navigation, and a footer that are all repeated from page to page.

One concept that is important to understand in order to use ASP.NET MVC effectively is convention over configuration. This means that instead of explicitly configuring where controllers and views live and what their names are, the MVC framework knows where to find them based on folder location and filename. Controllers live in the Controllers folder and are all suffixed with "Controller", and the views that correspond to a controller live in a subfolder under Views with a name matching the controller.
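
The convention can be expressed as a pair of simple naming rules. The sketch below is a simplification in Python with hypothetical function names. It covers only the Razor .cshtml case and the controller-specific folder, whereas the real framework searches additional locations such as Views/Shared.

```python
def controller_class_name(controller: str) -> str:
    """Route value 'Person' corresponds to a class named 'PersonController'."""
    return f"{controller}Controller"


def view_path(controller: str, action: str) -> str:
    """Default Razor view location for a controller/action pair (simplified)."""
    return f"~/Views/{controller}/{action}.cshtml"
```

Nothing is registered anywhere: given "Person" and "Create", both the class name and the view file follow from the names alone.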

According to the OReilly text, a common pattern used in web development is the "Front Controller". In ASP.NET MVC, the MVC runtime fulfills this role, while in Spring MVC it is handled by the DispatcherServlet. The front controller is responsible for routing. Routing is the process by which URLs are parsed and the appropriate controllers and actions are called.
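
As a sketch of what parsing a URL into a controller and action involves, here is a toy matcher in Python for the familiar {controller}/{action}/{id} pattern with Home and Index as defaults. The function name match_default_route is hypothetical, and real routing supports much more (constraints, multiple routes, query strings).

```python
from typing import Dict, Optional


def match_default_route(path: str) -> Dict[str, Optional[str]]:
    """Split a URL path into controller, action, and optional id."""
    segments = [s for s in path.strip("/").split("/") if s]
    if len(segments) > 3:
        raise ValueError(f"no route matches {path!r}")
    return {
        # Missing segments fall back to the route defaults.
        "controller": segments[0] if len(segments) > 0 else "Home",
        "action": segments[1] if len(segments) > 1 else "Index",
        "id": segments[2] if len(segments) > 2 else None,
    }
```

Under these rules "/Album/Details/5" selects the Details action on the Album controller with id "5", and "/" falls back to Home and Index.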

Choose between client-side and server side processing


A point made in the Pluralsight tutorial (OdeToFood) was that ASP.NET MVC is designed, out of the box, to degrade gracefully, which means that certain client/server decisions don't really need to be made, such as with validation. On clients that have JavaScript enabled, client-side validation will occur; otherwise the data- attributes related to validation are ignored and all the validation occurs on the server side. It's worth noting that client-side validation is a user experience feature, NOT a security feature. Validation provided for security purposes should always be done on the server, since the client side is outside of your control and can be manipulated.
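
To illustrate why the server has to validate regardless of what the browser did, here is a minimal server-side check in Python. The form fields (name, age), the limits, and the function name validate_person are invented for the example, and this is not the MVC model validation API. The server treats every submitted value as untrusted text.

```python
from typing import Dict, List


def validate_person(form: Dict[str, str]) -> List[str]:
    """Return a list of error messages; an empty list means the form is valid."""
    errors: List[str] = []

    name = form.get("name", "").strip()
    if not name:
        errors.append("Name is required.")
    elif len(name) > 50:
        errors.append("Name must be 50 characters or fewer.")

    try:
        age = int(form.get("age", ""))
    except ValueError:
        errors.append("Age must be a whole number.")
    else:
        if not 0 <= age <= 150:
            errors.append("Age must be between 0 and 150.")

    return errors
```

A request crafted by hand, bypassing any JavaScript checks, still goes through this function before anything is saved.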

The server vs client question won't always be answered for us, however. In cases where there is a great deal of intense calculation to be done, it may not be practical to do it all on the server if many users might be using the application at once (which speaks somewhat to scaling as well). If processing can be moved to the client without negatively affecting user experience, then doing so will make the application much more scalable. The flip side of that argument, however, is that we don't necessarily know whether a user's device has the appropriate capabilities. Whereas a user with a high performance desktop may easily handle crunching a large dataset on the client side, a mobile user would see a significant performance lag.

Another consideration is the security implications of exposing core business logic to the user. If the business logic is inherently sensitive, then server-side processing is an absolute must. The security of data connections is another consideration. Unless the intent is to expose your data deliberately, through REST services or other APIs, then it is less risky to handle data and services on the server side. Another consideration with data is bandwidth usage: passing huge chunks of data to the client so that the client can process it is really not saving you anything, just trading processor usage for network usage. Keeping processing close to the data will avoid unnecessary traffic. [link]

Design for scalability


The exam ref describes two kinds of scaling: horizontal scaling, which involves adding more servers to host your app, and vertical scaling, which involves making the one machine your app lives on more robust. The exam ref also notes that while database scaling won't always affect connectivity with the web application, it can, giving the example of two databases that live on two separate servers (with separate connection strings). The webforefront article also includes performance tuning of the application as a method of achieving scalability, which would include steps such as configuration changes and code refactoring.

Going back to horizontal and vertical scaling: vertical scaling is the simpler of the two, since it only involves adding resources to the existing node. Because it doesn't involve any architecture changes, the web application is not affected by this kind of scaling. However, it does have its limitations. For one, there is only so much one can do to upgrade a server. Operating systems limit the memory they can use (roughly 4 GB for 32-bit OSs; limits for 64-bit OSs are much higher in principle but vary by OS and edition, with 32 GB being one commonly quoted cap), and even with a better OS the hardware may not support the maximum resources, or may not use them effectively even if it does. In a web application, the limitation on TCP ports can also be a bottleneck on a single machine. Management and security might actually benefit from moving tiers off a single node, particularly the data and business logic tiers, since the website and the database generally have completely different management tools, and it is harder to break into both simultaneously if they are on different machines.


Horizontal scaling requires giving thought to how to handle resources that may need to be shared across servers.  Horizontal scaling creates affinity problems: session data must be synced between servers or a user must be explicitly handed off to the same server for the duration of a session (server affinity).
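
One simple way to get server affinity is to derive the server from the session identifier, so that the same session always lands on the same machine. The Python sketch below shows the idea only. The pick_server function and the server names are hypothetical, and a real load balancer would typically handle this with cookies or its own hashing scheme.

```python
import hashlib
from typing import List


def pick_server(session_id: str, servers: List[str]) -> str:
    """Map a session id to one server deterministically (sticky routing)."""
    if not servers:
        raise ValueError("at least one server is required")
    # A stable hash, so every front-end node computes the same answer.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

One caveat with this simple modulo approach is that adding or removing a server changes the mapping for many existing sessions, which is part of why syncing session data across servers is the other option.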

One effective way of scaling the business logic of an application is to decouple it into separate subsystems. This is called service-oriented architecture (SOA), as each subsystem acts as a service that can be aggregated with other services to fulfill the business logic of the application. These services often take the form of REST or SOAP based web services. Static content is easier to scale horizontally because it does not change rapidly and thus has less need for synchronization, and no problem with affinity. Load balancers can be used to facilitate scaling static and business logic content, though avoiding the need for them will create less long term hassle.

Horizontal scaling of the permanent data tier is most easily accomplished by using an out-of-the-box database cluster like Oracle RAC or MySQL Cluster. Another method is sharding, which involves breaking a table into pieces and spreading it out over several nodes.
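
Sharding can be sketched with in-memory dictionaries standing in for database nodes. This Python example assumes integer keys and a simple key-modulo-shard-count rule. The ShardedTable class is hypothetical, and real systems often shard by range or by a hash of the key instead.

```python
from typing import Dict, List, Optional


class ShardedTable:
    """One logical table whose rows are spread across several shards."""

    def __init__(self, shard_count: int) -> None:
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        # Each dict stands in for the piece of the table held on one node.
        self.shards: List[Dict[int, str]] = [{} for _ in range(shard_count)]

    def _shard_for(self, key: int) -> Dict[int, str]:
        # The key alone decides which node holds the row.
        return self.shards[key % len(self.shards)]

    def put(self, key: int, row: str) -> None:
        self._shard_for(key)[key] = row

    def get(self, key: int) -> Optional[str]:
        return self._shard_for(key).get(key)
```

Reads and writes for a given key touch only one shard, which is what lets the load spread across nodes.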
