Hacking up a WCF Client for a nonstandard SOAP service

by MikeHogg 12. March 2012 21:11

I set up a console shell application and framework for a team that collected data from hundreds of sources, but most of the data came from web scrapes and web services.

Setting up clients in VS usually requires just a few clicks, but when the servers are third party and not running Microsoft technologies, this sometimes doesn’t work.  The VS tool will just error out with some vague message about not being able to load the WSDL.  Using the command line will give you some hints, and sometimes you can download their WSDL locally, make a couple of edits, and then SvcUtil your client.

In one case in particular, even this didn’t work for me.  I was already resorting to writing custom XML requests and inspecting the responses with Fiddler to get my requests right.  I think it was some Java JBoss server, and apparently those are known for not serving a standard SOAP format; I forget the details why...  But I knew that I could write my own DataContract and OperationContract classes, and even write custom channel parsers if I had to.  They were serving lots and lots of datatypes and methods, though, and I only needed a few of them.  So I had to dissect their huge WSDL file, pulling out just the data objects I needed and writing them by hand instead of using svcutil, and then run tests to find what I was missing.  I also had to use XmlSerializerFormat instead of DataContractSerializer attributes, for some obscure reason.
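
While dissecting the WSDL, it helps to know exactly what the request envelope should look like on the wire before writing any contracts. Here is a small sketch, in Python rather than C#, of building that raw SOAP envelope by hand, the kind of request I was comparing against in Fiddler. This is illustrative only, not code from the project; the namespace matches the service contract shown further down, but the element shapes are my assumptions.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
MPR_NS = "http://empr.some.com/mpr/xml"  # namespace from the service contract

def build_query_envelope(company, day):
    """Hand-build a SOAP 1.1 envelope wrapping a QueryRequest element."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    request = ET.SubElement(body, "{%s}QueryRequest" % MPR_NS)
    item = ET.SubElement(request, "{%s}QueryPeakLoadSummary" % MPR_NS)
    ET.SubElement(item, "{%s}CompanyName" % MPR_NS).text = company
    ET.SubElement(item, "{%s}Day" % MPR_NS).text = day  # must be yyyy-MM-dd
    return ET.tostring(envelope, encoding="unicode")

print(build_query_envelope("NEV", "2012-01-23"))
```

Once the hand-built XML gets a good response back, you know what the typed contracts have to serialize to.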

Here was my client, constructed in c# to get the requests just right:

    class MPRClient : ClientBase<IMPR>, IMPR
    {
        public MPRClient()
            : base()
        {
            BasicHttpBinding binding = new BasicHttpBinding();
            binding.Security.Mode = BasicHttpSecurityMode.Transport;
            binding.Security.Transport.Realm = "eMPR Authentication";
            binding.Security.Transport.ClientCredentialType = HttpClientCredentialType.Basic;
            // need to set KeepAlive=false or we get a 505 after auth; this is one way
            CustomBinding cbinding = new CustomBinding(binding);
            foreach (BindingElement be in cbinding.Elements)
            {
                if (be is HttpsTransportBindingElement)
                    ((HttpsTransportBindingElement)be).KeepAliveEnabled = false;
            }
            Endpoint.Binding = cbinding;
        }

        public queryResponse query(queryRequest request)
        {
            queryResponse result = Channel.query(request);
            return result;
        }
    }

Here are some of the data classes that I figured out from testing.  You will see my request and response objects; take note of how I constructed the child objects, as arrays were the only way to get the serialization to line up just right…

    /// <summary>
    /// some's empr query web service
    /// </summary>
    [ServiceContract(Name = "empr", Namespace = "http://empr.some.com/mpr/xml")]
    [XmlSerializerFormat]  // this service would not work with the DataContract serializer
    interface IMPR
    {
        /// <summary>
        /// </summary>
        /// <param name="queryRequest">
        /// query takes two parms- CompanyName(LSE) and Day
        /// </param>
        /// <returns>
        /// sample data you can get from this service:
        /// <PeakLoadSummarySet>
        ///   <PeakLoadSummary Day="2012-01-23">
        ///     <LSE>NEV</LSE>
        ///     <ZoneName>AECO</ZoneName>
        ///     <AreaName>AECO</AreaName>
        ///     <UploadedMW>70.4</UploadedMW>
        ///     <ObligationPeakLoadMW>70.064</ObligationPeakLoadMW>
        ///     <ScalingFactor>0.99523</ScalingFactor>
        ///   </PeakLoadSummary>
        /// </PeakLoadSummarySet>
        /// </returns>
        [OperationContract(Action = "/mpr/xml/query")]
        queryResponse query(queryRequest queryRequest);
    }

    [MessageContract(WrapperName = "QueryRequest", WrapperNamespace = "http://empr.some.com/mpr/xml", IsWrapped = true)]
    public class queryRequest
    {
        [MessageBodyMember(Namespace = "http://empr.some.com/mpr/xml", Order = 0)]
        public QueryPeakLoadSummary[] Items;

        public queryRequest() { }
        public queryRequest(QueryPeakLoadSummary[] items)
        {
            Items = items;
        }
    }

    [System.Xml.Serialization.XmlType(AnonymousType = true, Namespace = "http://empr.some.com/mpr/xml")]
    public class QueryPeakLoadSummary
    {
        public string CompanyName;
        public string Day;

        public QueryPeakLoadSummary() { }
    }

    [MessageContract(WrapperName = "QueryResponse", WrapperNamespace = "http://empr.some.com/mpr/xml", IsWrapped = true)]
    public class queryResponse
    {
        [MessageBodyMember(Namespace = "http://empr.some.com/mpr/xml", Order = 0)]
        public PeakLoadSummarySet[] Items;

        public queryResponse() { }
        public queryResponse(PeakLoadSummarySet[] Items)
        {
            this.Items = Items;
        }
    }

    [System.Xml.Serialization.XmlType(AnonymousType = true, Namespace = "http://empr.some.com/mpr/xml")]
    public class PeakLoadSummarySet
    {
        [System.Xml.Serialization.XmlElement("PeakLoadSummary", Order = 0)]
        public PeakLoadSummary[] PeakLoadSummary;
    }

    [System.Xml.Serialization.XmlType(AnonymousType = true, Namespace = "http://empr.some.com/mpr/xml")]
    public class PeakLoadSummary
    {
        [System.Xml.Serialization.XmlElement(Order = 0)]
        public string LSE;
        [System.Xml.Serialization.XmlElement(Order = 1)]
        public string ZoneName;
        [System.Xml.Serialization.XmlElement(Order = 2)]
        public string AreaName;
        [System.Xml.Serialization.XmlElement(Order = 3)]
        public string UploadedMW;
        [System.Xml.Serialization.XmlElement(Order = 4)]
        public string ObligationPeakLoadMW;
        [System.Xml.Serialization.XmlElement(Order = 5)]
        public double ScalingFactor;
        [System.Xml.Serialization.XmlAttribute]  // Day rides on the element as an attribute in the sample data
        public DateTime Day;

        public PeakLoadSummary() { }
    }
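
A quick way to see why the child objects have to be arrays: the response nests repeated PeakLoadSummary elements inside a PeakLoadSummarySet, two levels of repetition. This little Python sketch (illustrative only, using the sample data from the doc comment above) parses that shape:

```python
import xml.etree.ElementTree as ET

# Sample response body, shaped like the doc comment's example data.
SAMPLE = """
<QueryResponse xmlns="http://empr.some.com/mpr/xml">
  <PeakLoadSummarySet>
    <PeakLoadSummary Day="2012-01-23">
      <LSE>NEV</LSE>
      <ZoneName>AECO</ZoneName>
      <AreaName>AECO</AreaName>
      <UploadedMW>70.4</UploadedMW>
      <ObligationPeakLoadMW>70.064</ObligationPeakLoadMW>
      <ScalingFactor>0.99523</ScalingFactor>
    </PeakLoadSummary>
  </PeakLoadSummarySet>
</QueryResponse>
"""

NS = {"m": "http://empr.some.com/mpr/xml"}

def parse_summaries(xml_text):
    """Return one dict per PeakLoadSummary.  Note the two nested repeating
    levels, which is why the C# contracts model both levels as arrays."""
    root = ET.fromstring(xml_text)
    rows = []
    for s in root.findall("m:PeakLoadSummarySet/m:PeakLoadSummary", NS):
        row = {child.tag.split("}")[1]: child.text for child in s}
        row["Day"] = s.get("Day")  # Day is an attribute, not a child element
        rows.append(row)
    return rows

print(parse_summaries(SAMPLE))
```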


My client config was just a one-line endpoint, since the option to set KeepAliveEnabled was not available in config, so I put it in the c# initialization:

        <binding name="pooledInstanceNetTcpEP_something else
        <binding name="OperatorInterfaceSoap" closeTimeout="00:01:00" openTimeout="00:01:00" receiveTimeout="00:10:00" sendTimeout="00:01:00" allowCookies="false" bypassProxyOnLocal="false" hostNameComparisonMode="StrongWildcard" maxBufferPoolSize="524288" maxReceivedMessageSize="65536" messageEncoding="Text" textEncoding="utf-8" useDefaultWebProxy="true">
          <readerQuotas maxDepth="32" maxStringContentLength="8192" maxArrayLength="16384" maxBytesPerRead="4096" maxNameTableCharCount="16384"/>
          <security mode="Transport">
            <transport clientCredentialType="Basic"/>
          </security>
        </binding>
      <endpoint address="net.tcp://Tosomethingelse
      <endpoint address="https://b2bsomething else
      <endpoint address="https://rpm.pjm.com/erpm/services/query"  binding="basicHttpBinding"
                contract="jobs.IRPM" />

And then I could write business code just like normal:

        private List<queryResponse> GetResponses(List<Account> accounts, DateTime date)
        {
            List<queryResponse> result = new List<queryResponse>();
            foreach (Account account in accounts)
            {
                MPRClient r = new MPRClient();
                r.ClientCredentials.UserName.UserName = account.Username;
                r.ClientCredentials.UserName.Password = account.Password;
                result.Add(r.query(new queryRequest(
                    new QueryPeakLoadSummary[] {
                        // Day must be zero-padded, yyyy-MM-dd
                        new QueryPeakLoadSummary { CompanyName = account.Company, Day = date.ToString("yyyy-MM-dd") }
                    })));
            }
            return result;
        }



Loading log files into Oracle

by MikeHogg 8. March 2012 17:51

One of my last Oracle projects was pretty neat, because I started working with the new 11g feature, external tables.  This let Oracle mount a file as a table, and was incredibly fast compared to using SQL*Loader, which was what we had been doing for years. 

In this case I was loading unix log files on the order of millions of rows per daily file, by loading the external table and then processing that table into our permanent logging table.  The data sets involved here were pretty big, so the usual manipulation, like inserts of millions of rows, would take hours and hours.  Changing from SQL*Loader to external tables saved a lot of time, but I still had a lot of inserts to make, so I added some tweaks, like dropping indices and recreating them afterward, and then updating stats with the new indices for Oracle’s query optimizer. 
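
The index tweaks aren’t shown in the procs below, but they looked roughly like this sketch (the index and column names here are made up for illustration; `dbms_stats.gather_table_stats` is Oracle’s standard call for refreshing optimizer statistics after a big load):

```sql
-- Drop the index first so Oracle doesn't maintain it row by row during the load
drop index idx_meta_activity_ts;

-- ... bulk insert millions of rows into mdf_meta_activity here ...

-- Recreate the index in one pass, then refresh the optimizer's statistics
create index idx_meta_activity_ts on mdf_meta_activity (time_stamp);
begin
  dbms_stats.gather_table_stats(ownname => user,
                                tabname => 'MDF_META_ACTIVITY',
                                cascade => true);  -- include the new indices
end;
/
```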

Once I had the files shared on a network location accessible to this Oracle unix server, I loaded them with this proc:

  procedure LoadExtTable(filedate varchar2) is
  begin
    -- note: no trailing semicolon inside the dynamic DDL string
    execute immediate 'create table mdf_meta_activity_dump ( IP_ADDRESS VARCHAR2(255), PID NUMBER,' ||
                      'SYMBOL VARCHAR2(255), USER_ID VARCHAR2(50), APPLICATION VARCHAR2(60),' ||
                      'HOSTNAME VARCHAR2(60), SYMBOL_MESSAGE VARCHAR2(255), SYMBOL_FORMAT VARCHAR2(255),' ||
                      'SCRIPT_NAME VARCHAR2(255), PROCMON_PROCESS VARCHAR2(255), TIME_STAMP DATE )' ||
                      ' organization external (type oracle_loader default directory exttabdir access parameters ' ||
                      '(RECORDS DELIMITED BY NEWLINE FIELDS TERMINATED by ''|'' ' ||
                      ' ) LOCATION (''\someplace\somedb\udpserver\udp.txt''))';
  end LoadExtTable;

I would process the dump with this proc, which also updated two other tables and was written to be re-runnable, so that, in case of failure or just manual mistake, running the same file of millions of rows would not leave a mess of a million duplicates. 

You will also see Oracle bulk statements here, and logging, which allowed someone to monitor the process in real time, as it usually took some minutes or hours.

  procedure ProcessActivityDump is
    cursor c_log(p_file_date date) is 
           select s.id, d.user_id, d.symbol_message, d.time_stamp, p_file_date, trunc(d.time_stamp), to_char(d.time_stamp,'M')
            from mdf_meta_symbol s
            join mdf_meta_activity_dump d
              on s.name = d.symbol;
    type t_activity is table of c_log%rowtype;
    r_activity t_activity;
    v_count number; 
    v_file_date date;
  begin
    -- PROCS
    merge into mdf_meta_proc p
    using (select distinct procmon_process, script_name from mdf_meta_activity_dump) d
    on (p.procmonjob = d.procmon_process and p.script = d.script_name)    
    when not matched then 
      insert (id, procmonjob, script, active_fg, insert_date, audit_date, audit_user)
      values(seq_mdf_id.nextval, procmon_process, script_name, 1, sysdate, sysdate, 'PKG_META');
    Log_This('PKG_META.ProcessActivityDump','MDF_META_PROC new rows inserted: ' || sql%rowcount ,'INFO');
    -- SYMBOL, rerunnable
    merge into mdf_meta_symbol s
    using (select distinct symbol, p.id from mdf_meta_activity_dump join mdf_meta_proc p on procmon_process = procmonjob and script_name = script) d
    on (s.name = d.symbol)
    when not matched then 
      insert(id, name, proc_id) values (seq_mdf_id.nextval, symbol, d.id);
    Log_This('PKG_META.ProcessActivityDump','MDF_META_SYMBOL new rows inserted: ' || sql%rowcount ,'INFO');    
    -- the file's date is the day with the most rows in the dump
    select file_date into v_file_date from (
                     select trunc(time_stamp) file_date, count(*) 
                       from mdf_meta_activity_dump 
                      group by trunc(time_stamp) 
                      order by count(*) desc) where rownum = 1;
    -- delete existing activity for this day, to make rerunnable   
    delete from mdf_meta_activity where file_date = v_file_date; 
    Log_This('PKG_META.ProcessActivityDump','Dump_Date: ' || v_file_date || ' rows deleted in preparation for new load: ' || sql%rowcount ,'INFO');
    -- now add the activity, logging only every 200k or so
    -- maybe need to drop idx and recreate after
    open c_log(v_file_date);    
    v_count := 0;
    loop
      fetch c_log bulk collect into r_activity limit 1000;
      exit when r_activity.count = 0;
      forall idx in 1..r_activity.count
        insert into mdf_meta_activity
        values   r_activity(idx);
      v_count := v_count + r_activity.count;
      if Mod(v_count, 200000) = 0  then
        Log_This('PKG_META.ProcessActivityDump','Cumulative insert now at ' || v_count || ' rows','INFO');
      end if;
    end loop; 
    close c_log;
  end ProcessActivityDump;

And that’s it.


Oracle | Automation

About Mike Hogg

Mike Hogg is a c# developer in Brooklyn.

More Here
