Multivariate Chemometric Models and Application of Genetic Algorithm for Simultaneous Determination of Ledipasvir and Sofosbuvir in Pure Form and in Pharmaceutical Preparation: A Comparative Study

Objectives: Four multivariate chemometric methods have been developed for the simultaneous determination of sofosbuvir and ledipasvir in their pure and pharmaceutical dosage forms. Methods: First, partial least squares and artificial neural network models were applied for the quantitative analysis of the studied drugs. Results: Synthetic mixtures of sofosbuvir and ledipasvir in different ratios were prepared according to an experimental design. The zero-order absorption spectra of these mixtures were recorded over the wavelength range 200-400 nm at 1 nm intervals. The resulting absorbance and concentration data matrices were used to build calibration (regression) models, which were then used to predict the unknown concentrations of each drug in their mixtures. In addition, applying a genetic algorithm to the partial least squares and artificial neural network models greatly increased the precision and predictive ability of the methods. The four methods were successfully applied to quantify sofosbuvir and ledipasvir in a real market sample. Conclusion: The investigated methods were found to be accurate and precise, and could resolve the overlapped spectra of the mixture without any preliminary separation steps.

To the best of our knowledge, no chemometric methods are available for the simultaneous determination of ledipasvir (LED) and sofosbuvir (SOF).
Hence, the aim of this work was to develop accurate and precise chemometric methods for the simultaneous determination of LED and SOF in their dosage form. The developed methods are partial least squares (PLS-1), PLS-1 combined with a genetic algorithm (GA-PLS-1), an artificial neural network (ANN), and ANN combined with a genetic algorithm (GA-ANN).

Instruments
A Shimadzu UV-Vis 1800 spectrophotometer (Tokyo, Japan) equipped with 10 mm matched quartz cells was used. The spectral bandwidth was 2 nm and the scanning speed was 2800 nm/min, with a 1 nm sampling interval.
PLS, ANN and the GA were carried out using PLS Toolbox software version 2.1 in conjunction with the Neural Network Toolbox. Student's t-test and the F-test were performed using Microsoft Excel.
All calculations were performed on a quad-core CPU (1.47 GHz) with 4.00 GB of RAM under Microsoft Windows 7™.

Materials and Reagents
Pure LED and SOF were kindly supplied by Mash Premiere for Pharmaceutical and Cosmetics Industries, Third Industrial Zone, Badr City, Egypt. Their purities were 99.25% and 99.7%, respectively, according to the company certificates.
Pharmaceutical preparation: Sofolanork Plus® tablets (Batch no. M 169916), manufactured by Mash Premiere for Pharmaceutical and Cosmetics Industries, were purchased from local pharmacies. Each tablet is labelled to contain 400 mg of SOF and 90 mg of LED.

Standard solutions
A standard solution of LED (450 μg mL⁻¹) was prepared by dissolving 45 mg of LED in 50 mL of methanol and completing the volume to 100 mL with methanol. A working solution of LED (45 μg mL⁻¹) was prepared by transferring 10 mL of the standard solution to a 100 mL volumetric flask and completing to volume with methanol.
A standard solution of SOF (1 mg mL⁻¹) was prepared by dissolving 100 mg of SOF in 50 mL of methanol and completing the volume to 100 mL with methanol. A working solution of SOF (200 μg mL⁻¹) was prepared by transferring 20 mL of the standard solution to a 100 mL volumetric flask and completing to volume with methanol.
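The stated concentrations follow from simple mass/volume and C1V1 = C2V2 arithmetic. The short Python sketch below checks them; the function names are illustrative and not part of the original work.

```python
def stock_conc_ug_per_ml(mass_mg, final_volume_ml):
    """Concentration (ug/mL) after dissolving mass_mg and making up to final_volume_ml."""
    return mass_mg * 1000.0 / final_volume_ml


def dilute(c1_ug_per_ml, v1_ml, v2_ml):
    """Concentration after diluting v1_ml of a c1 solution to v2_ml (C1*V1 = C2*V2)."""
    return c1_ug_per_ml * v1_ml / v2_ml


led_stock = stock_conc_ug_per_ml(45, 100)      # 450 ug/mL
led_working = dilute(led_stock, 10, 100)       # 45 ug/mL
sof_stock = stock_conc_ug_per_ml(100, 100)     # 1000 ug/mL, i.e. 1 mg/mL
sof_working = dilute(sof_stock, 20, 100)       # 200 ug/mL

print(led_stock, led_working, sof_stock, sof_working)
```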

Experimental design
A 5-level, 2-factor experimental design was used, in which 0.8, 0.9, 1.0, 1.1 or 1.2 mL aliquots of the LED and SOF working solutions (equivalent to 36, 40.5, 45, 49.5 and 54 μg of LED and 160, 180, 200, 220 and 240 μg of SOF) were combined and diluted to 10 mL with methanol, resulting in 25 mixtures 9. The central level of the design is 4.5 μg mL⁻¹ for LED and 20 μg mL⁻¹ for SOF. The concentrations chosen for each compound are based on their linearity ranges and on the ratio of the two compounds in their pharmaceutical preparation. The concentration details are given in Table 1.
Thirteen mixtures of this design (odd numbers) were used for calibration with cross validation, and the other 12 mixtures (even numbers) were used as a validation set to test the predictability of the proposed multivariate models.
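As an illustration, the design can be enumerated in a few lines of Python. Final concentrations are taken as aliquot volume × working concentration / 10 mL, which gives 3.6 to 5.4 μg mL⁻¹ for LED and 16 to 24 μg mL⁻¹ for SOF, consistent with the stated central level. The mixture numbering used here (a plain full-factorial order) is an assumption; the actual order is the one given in Table 1. With 25 mixtures numbered 1 to 25, the odd-numbered subset has 13 members (matching the 13 calibration spectra mentioned in the PLS section) and the even-numbered subset has 12.

```python
from itertools import product

ALIQUOTS_ML = [0.8, 0.9, 1.0, 1.1, 1.2]   # aliquot volumes from the text
LED_WORKING = 45.0                         # ug/mL working solution
SOF_WORKING = 200.0                        # ug/mL working solution
FINAL_VOLUME_ML = 10.0


def build_design():
    """Return a list of (mixture_no, led_ug_per_ml, sof_ug_per_ml) tuples."""
    design = []
    # Assumed ordering: every LED level combined with every SOF level.
    for no, (v_led, v_sof) in enumerate(product(ALIQUOTS_ML, repeat=2), start=1):
        led = round(v_led * LED_WORKING / FINAL_VOLUME_ML, 2)
        sof = round(v_sof * SOF_WORKING / FINAL_VOLUME_ML, 2)
        design.append((no, led, sof))
    return design


design = build_design()
calibration = [m for m in design if m[0] % 2 == 1]   # odd-numbered mixtures
validation = [m for m in design if m[0] % 2 == 0]    # even-numbered mixtures

print(len(design), len(calibration), len(validation))
print(design[12])   # central point of the design
```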

Application of the method to pharmaceutical formulation
Ten tablets of Sofolanork Plus® (400/90 mg) were finely powdered, and an amount equivalent to one tablet (400 mg of SOF and 90 mg of LED) was extracted three times with 25 mL of methanol and filtered into a 100 mL volumetric flask. The volume was then adjusted with methanol to obtain a solution labelled to contain 4000 μg mL⁻¹ of SOF and 900 μg mL⁻¹ of LED. This solution was diluted to obtain a solution labelled to contain 400 μg mL⁻¹ of SOF and 90 μg mL⁻¹ of LED. The spectra of these solutions were scanned from 200 to 400 nm, stored in the computer and analysed by the proposed methods.

RESULTS AND DISCUSSION
Spectroscopic techniques can supply the analyst with a large amount of data within a short period of time. Coupling the spectral data with chemometric models enhances the quality of the spectral information and turns the combined technique into a powerful and highly convenient analytical tool. Few spectrophotometric methods have been introduced for the simultaneous analysis of SOF and LED. All of these methods require manipulation, either through derivatization of the absorption spectra, processing of the ratio spectra, or reliance on a critical point (isoabsorptive point). This prompted the authors to apply different chemometric methods, namely PLS, GA-PLS, ANN and GA-ANN, for the simultaneous analysis of the studied drugs. The described methods offer higher predictive power, provide maximum relevant information, and allow a large number of samples to be analysed in a short period of time with a higher degree of accuracy and precision.
The UV spectra of SOF and LED show a certain degree of overlap (Figure 3), which creates difficulty in the simultaneous analysis of this mixture. Therefore, multivariate calibration methods were applied to predict the concentrations of SOF and LED in both the calibration and validation sets as well as in their pharmaceutical formulation.
GA searches the solution space of a function through simulated evolution. It solves the optimization problem by exploring all regions of the potential solutions and exponentially exploiting promising areas through mutation, crossover and selection operations applied to individuals in the population. A critical issue for successful GA performance is the adjustment of the GA parameters 10. To avoid the risk of overfitting, a number of independent short runs were performed and the results of all runs were taken into consideration to obtain the final model. In this way, a much more consistent (and less overfitted) solution can be obtained 11,12. The adjusted GA parameters giving the lowest mean square error are shown in Table 2.
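The selection, crossover and mutation steps described above can be sketched as a minimal binary GA for wavelength selection. This is only an illustration under stated assumptions, not the PLS Toolbox implementation used in this work: the population size, rates, tournament selection and elitism are arbitrary choices, and `toy_cost` is a hypothetical stand-in for the real fitness, which would be the cross-validated error of a PLS model built on the selected wavelengths. The last lines mimic the idea of several independent short runs by keeping the variables selected in most runs.

```python
import random


def run_ga(cost, n_vars, pop_size=30, generations=40,
           crossover_rate=0.8, mutation_rate=0.01, seed=0):
    """Minimise cost(mask) over binary masks (True = wavelength kept)."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_vars)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(((cost(ind), ind) for ind in pop), key=lambda s: s[0])
        history.append(scored[0][0])
        next_pop = [scored[0][1][:]]                 # elitism: keep the best mask
        while len(next_pop) < pop_size:
            # Selection: two parents, each the winner of a 3-way tournament.
            p1 = min(rng.sample(scored, 3), key=lambda s: s[0])[1]
            p2 = min(rng.sample(scored, 3), key=lambda s: s[0])[1]
            child = p1[:]
            if rng.random() < crossover_rate:        # single-point crossover
                cut = rng.randrange(1, n_vars)
                child = p1[:cut] + p2[cut:]
            # Mutation: flip each gene with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    best_cost, best = min(((cost(ind), ind) for ind in pop), key=lambda s: s[0])
    history.append(best_cost)
    return best, history


N_VARS = 201                        # 200-400 nm at 1 nm steps
INFORMATIVE = set(range(40, 90))    # hypothetical informative wavelength indices


def toy_cost(mask):
    """Hypothetical fitness: penalise missed informative and kept uninformative variables."""
    missed = sum(1 for i in INFORMATIVE if not mask[i])
    extra = sum(1 for i, keep in enumerate(mask) if keep and i not in INFORMATIVE)
    return missed + 0.1 * extra


best, history = run_ga(toy_cost, N_VARS, seed=0)

# Several independent short runs; keep variables chosen in at least 3 of 5 runs.
runs = [run_ga(toy_cost, N_VARS, generations=20, seed=s)[0] for s in range(5)]
consensus = [sum(mask[i] for mask in runs) >= 3 for i in range(N_VARS)]
print(history[0], history[-1], sum(consensus))
```

Because the best individual is copied unchanged into each new generation, the best cost recorded in `history` can never increase from one generation to the next.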

Partial least squares (PLS) and applying genetic algorithm (GA-PLS)
PLS-1 is a widely used regression method. Information from the concentration values is introduced into the calculation of the so-called latent variables, which are linear combinations of the original variables. The PLS-1 method was run on the calibration absorption spectra. To select the number of factors in the PLS-1 algorithm, leave-one-out cross validation (CV) was applied to the calibration set of 13 spectra. The root mean squared error of cross validation (RMSECV) was recalculated upon addition of each new factor to the PLS-1 model. The number of factors was then selected based on the Haaland and Thomas criterion 13. Two factors were found to be sufficient for modelling both SOF and LED. To improve the calibration further, the genetic algorithm (GA) variable selection technique was applied to exclude uninformative variables. The predictability of both models was tested on the validation set, and the PLS-1 model constructed after removing the uninformative variables was found to be simpler and more robust, with lower root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP). This is attributed to the exclusion of the uninformative wavelengths. The % recoveries, relative standard deviation (RSD) and RMSEP values of the validation set for the PLS and GA-PLS models are listed in Table 3.
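The leave-one-out procedure can be illustrated with a small NIPALS-style PLS-1 written in plain Python. This is a sketch, not the PLS Toolbox routine used here: the spectra are synthetic Gaussian bands with made-up absorptivities and noise, and a 5 nm grid replaces the 1 nm grid to keep the example fast. It only prints RMSECV against the number of factors; the Haaland and Thomas test on these values is not implemented.

```python
import math
import random


def pls1_fit(X, y, n_factors):
    """Fit PLS-1 by NIPALS on mean-centred data; returns means and (w, p, q) per factor."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    f = [v - y_mean for v in y]
    factors = []
    for _ in range(n_factors):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:                      # nothing left to model
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]   # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]  # deflate X
        f = [f[i] - q * t[i] for i in range(n)]                            # deflate y
        factors.append((w, p, q))
    return x_mean, y_mean, factors


def pls1_predict(model, x):
    """Predict one sample by extracting its scores factor by factor."""
    x_mean, y_mean, factors = model
    e = [xi - mi for xi, mi in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in factors:
        t = sum(ei * wi for ei, wi in zip(e, w))
        y_hat += q * t
        e = [ei - t * pi for ei, pi in zip(e, p)]
    return y_hat


def rmsecv(X, y, n_factors):
    """Leave-one-out RMSECV: each spectrum is predicted by a model built without it."""
    press = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_factors)
        press += (pls1_predict(model, X[i]) - y[i]) ** 2
    return math.sqrt(press / len(X))


# Synthetic two-component data (all spectral parameters are hypothetical).
rng = random.Random(1)
wavelengths = list(range(200, 401, 5))


def band(center, width):
    return [math.exp(-((wl - center) / width) ** 2) for wl in wavelengths]


eps_sof, eps_led = band(260, 25), band(290, 40)
grid = [(s, l) for s in (16, 18, 20, 22, 24) for l in (3.6, 4.05, 4.5, 4.95, 5.4)]
conc = grid[::2]                              # 13 odd-numbered mixtures
X = [[0.02 * s * es + 0.1 * l * el + rng.gauss(0, 0.002)
      for es, el in zip(eps_sof, eps_led)] for s, l in conc]
y = [s for s, _ in conc]                      # model SOF; LED gets its own PLS-1 model

cv = {k: rmsecv(X, y, k) for k in (1, 2, 3)}
for k, value in cv.items():
    print(k, round(value, 4))
```

For a two-component mixture with overlapping bands, the error is expected to drop sharply from one to two factors and change little afterwards, which is the behaviour behind the choice of two factors reported above.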
The GA was run on 201 variables for SOF and LED using PLS with the optimum number of latent variables (LVs) determined by cross validation on the model containing all the variables. GA reduced the absorbance matrix to about 49% of the original matrix for LED and 45% for SOF. The parameters involved in the application of GA to the PLS model are shown in Figures 4 and 5 for SOF and LED, respectively.

Artificial neural network (ANN) and applying genetic algorithm (GA-ANN)
ANNs are computational models that simulate biological neural networks. They are composed of an interconnected group of artificial neurons. To optimize a neural network, a trial-and-error approach has to be used to find the best network architecture 14,15. Choosing the optimum parameter values to construct the network is not an easy task because the parameters are mutually related.
The output layer represents the concentration vector of one component. The hidden layer consists of a single layer, which is sufficient to solve similar or more complex problems; moreover, additional hidden layers may cause overfitting. The number of hidden neurons is one of the most important ANN parameters to be adjusted. This parameter is related to the convergence of the output error function during the learning process.
The transfer function pair is also an important parameter that should be adjusted carefully. The choice of transfer function is based on the nature of the data to be analysed. In the present work, the purelin-purelin transfer function pair was used because of the linear correlation between absorbance and concentration. The learning rate controls the degree to which the connection weights are modified during the learning phase. The optimized parameter values of the ANN for SOF and LED are shown in Table 4.
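To illustrate why a purelin-purelin pair suits data that obey a linear absorbance-concentration relationship, the sketch below runs a forward pass through a one-hidden-layer network with linear transfer functions and shows that it is equivalent to a single linear model. The weights, biases and input are arbitrary illustrative numbers, not the trained values of this work, and no training step is shown.

```python
def purelin(v):
    """Linear transfer function: the output equals the net input."""
    return v


def forward(x, W1, b1, W2, b2):
    """Forward pass: inputs -> one hidden layer (purelin) -> single output (purelin)."""
    hidden = [purelin(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return purelin(sum(w * h for w, h in zip(W2, hidden)) + b2)


def collapse(W1, b1, W2, b2):
    """Equivalent single linear model y = coef . x + intercept."""
    n_hidden, n_in = len(W1), len(W1[0])
    coef = [sum(W2[k] * W1[k][j] for k in range(n_hidden)) for j in range(n_in)]
    intercept = sum(W2[k] * b1[k] for k in range(n_hidden)) + b2
    return coef, intercept


# Hypothetical network: 3 input absorbances, 2 hidden neurons, 1 output.
W1 = [[0.2, -0.1, 0.4], [0.05, 0.3, -0.2]]
b1 = [0.1, -0.3]
W2 = [1.5, -0.7]
b2 = 0.25
x = [0.31, 0.52, 0.18]

y_network = forward(x, W1, b1, W2, b2)
coef, intercept = collapse(W1, b1, W2, b2)
y_linear = sum(c * xi for c, xi in zip(coef, x)) + intercept
print(y_network, y_linear)
```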
ANNs showed better RMSEP than PLS-1, which may be because ANNs are a type of artificial intelligence in which there is less chance of overfitting than may occur in PLS calibrations. The % recoveries, % RSD and RMSEP values of the validation set for the ANN and GA-ANN models are listed in Table 3.
Applying the ANN after the GA variable selection step, rather than to the raw data, improved the results. A large number of nodes in the input layer of the network (wavelengths) increases the CPU time for ANN modelling. GA allowed the use of fewer neurons (and hence shorter training time) than the network built on the raw data.
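The figures of merit reported for the validation set (% recovery, % RSD and RMSEP) can be computed as in the following sketch. The taken and found concentrations are hypothetical numbers used only to exercise the functions, and taking the RSD over the individual recoveries is an assumption about how the tabulated value was obtained.

```python
import math
import statistics


def recoveries(found, taken):
    """Percent recovery of each validation sample."""
    return [100.0 * f / t for f, t in zip(found, taken)]


def rsd_percent(values):
    """Relative standard deviation (%), using the sample standard deviation."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def rmsep(found, taken):
    """Root mean square error of prediction, in concentration units."""
    return math.sqrt(sum((f - t) ** 2 for f, t in zip(found, taken)) / len(taken))


# Hypothetical validation data for one drug (ug/mL).
taken = [16.0, 18.0, 20.0, 22.0, 24.0]
found = [15.9, 18.1, 20.05, 21.8, 24.1]

rec = recoveries(found, taken)
print(round(statistics.mean(rec), 2), round(rsd_percent(rec), 2), round(rmsep(found, taken), 3))
```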

Analysis of real market sample
The proposed procedure was applied for the determination of LED in the presence of SOF in Sofolanork Plus® tablets. Satisfactory results were obtained, in good agreement with the label claim. The obtained results were statistically compared with those obtained by the reported method 8. No significant differences were found on applying the two-tailed Student's t-test and the F-test at the 95% confidence level 16, indicating good accuracy and precision of the proposed methods for the analysis of the studied drug in its pharmaceutical dosage form, as shown in Table 5.
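The statistical comparison can be sketched as follows. The recovery values are hypothetical, five determinations per method are assumed, and the critical values are the standard tabulated ones for that assumed sample size (t = 2.306 for 8 degrees of freedom, two-tailed; F = 6.39 for 4 and 4 degrees of freedom, both at p = 0.05). The actual numbers of experiments and tabulated values are those given in Table 5.

```python
import math
import statistics


def pooled_t(a, b):
    """Student's t statistic for two independent samples with pooled variance."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    diff = abs(statistics.mean(a) - statistics.mean(b))
    return diff / math.sqrt(sp2 * (1.0 / na + 1.0 / nb))


def f_ratio(a, b):
    """Variance ratio F, with the larger variance in the numerator."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)


# Hypothetical % recoveries (n = 5 each); not data from this study.
proposed = [99.5, 100.2, 100.8, 99.9, 100.4]
reported = [99.8, 100.5, 100.1, 99.6, 100.9]

T_CRIT = 2.306   # two-tailed, p = 0.05, 8 degrees of freedom
F_CRIT = 6.39    # p = 0.05, (4, 4) degrees of freedom

t_value = pooled_t(proposed, reported)
f_value = f_ratio(proposed, reported)
print(round(t_value, 3), round(f_value, 3))
print("no significant difference" if t_value < T_CRIT and f_value < F_CRIT
      else "significant difference")
```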

CONCLUSION
In this study, accurate and precise multivariate chemometric models were developed. LED and SOF could be determined simultaneously in their tablets using the developed methods. The developed methods have the advantages of being sensitive and inexpensive, unlike HPLC procedures, which are time consuming and expensive.
Application of GA to the PLS and ANN models improved the results with respect to RMSEP. The developed methods can be applied for the routine analysis of ledipasvir in its pure form and in tablets.

*** Absorbance subtraction method, in which a mathematically estimated factor representing the absorbance ratio (A262.4/A325) for pure LED was calculated; this factor was then used for the simultaneous quantitation of LED and SOF using an equation computed at λiso (262.4 nm) 8.

Figure 4. Parameters involved in the application of GA to the PLS model for SOF.

Figure 5. Parameters involved in the application of GA to the PLS model for LED.

Table 1. The 5-level, 2-factor experimental design shown as concentrations of the mixture components in μg mL⁻¹.
The shaded rows represent the validation set.

Table 5. Statistical comparison of the results obtained by the proposed methods and the reported method for the analysis of LED and SOF in Sofolanork Plus® tablets.
* Number of experiments. ** The values in parentheses are the tabulated values of t and F at p = 0.05.